CORRELATION ANALYSIS AS A MEANS OF STUDYING
CONTRIBUTIONS OF CAUSES

by

Walter S. Monroe
University of Illinois

and

D. B. Stuit
Carleton College


INTRODUCTION

When a trait or condition has been demonstrated to be a cause of a second trait or condition, a measure of the magnitude of the contribution is frequently desired. When paired measures of the two traits or conditions are available for a representative population, correlation analysis is commonly employed for this purpose. The measures of the effect are designated as the dependent variable and those of the cause, the independent variable. Techniques of correlation analysis have been developed also for situations in which there are two or more independent variables.

If $x_0$ and $x_1$ represent correlated variables, expressed in terms of deviations from their respective means, it is possible to explain the existence of this relationship in terms of a common factor. This theorem¹ means that the two variables may be analyzed as follows:

$$x_0 = c_0 a_{01} + e_0$$
$$x_1 = c_1 a_{01} + e_1$$

In these equations $c_0$ and $c_1$ are constants; $e_0$ and $e_1$ are factors which are uncorrelated with each other and with $a_{01}$; $a_{01}$ is a factor whose magnitude for any pair of values of $x_0$ and $x_1$ is the same, or, if the nature of $x_0$ and $x_1$ is such that this condition is not reasonable, $a_{01}$ designates two variables that are perfectly correlated. By a proper choice of units, $c_0$ can be reduced to unity. Hence, we may deal with the following analysis:

$$x_0 = a_{01} + e_0$$
$$x_1 = c_1 a_{01} + e_1$$


As implied in the preceding paragraph, a causal variable is typically complex but it may be thought of as analyzable into two uncorrelated sub-variables, one ($a_{01}$) representing the contribution and the other being uncorrelated with the effect or dependent variable. If the latter, represented by $e_1$, is zero, the independent variable $x_1$ is described as contributing itself completely to $x_0$ and may be designated as a component variable. The uncorrelated sub-variables are spoken of as factors and $a_{01}$ is called the common factor.

1. For proof of this theorem, see Kelley, T. L. Crossroads in the Mind of Man. Stanford, California: Stanford University Press, 1928, p. 38.

It is not possible to determine the values of the factors of two given variables as indicated, but the meaning of the analysis may be illustrated by taking sums of the corresponding values of uncorrelated variables¹ as in Table I. If $X_1$ is thought of as a cause of $X_0$, its contribution² is obviously represented by $A_1$. Since the ratio of $A_1$ to $X_0$ varies and is dependent upon the location of the zero points, it has no meaning as a measure of the contribution of $X_1$ to $X_0$. If the variables are expressed in deviation form, the situation is not improved. An approach to a meaningful measure can be made by thinking of the standard deviation of the distribution of the values of a variable as a measure of its "magnitude." From this point of view, the contribution of an independent variable that contributes itself completely to the dependent variable would be measured by the ratio of its standard deviation to the standard deviation of the dependent variable. However, to facilitate algebraic treatment, the standard deviation squared, called the variance, is employed in preference to the first power. The contribution of $X_1$ to $X_0$ can then be described in terms of variance. Representing the variance of $A_1$ by $\sigma_{A_1}^2$, the variance ratio $\sigma_{A_1}^2 / \sigma_0^2$ is a measure of the contribution of $X_1$ to $X_0$. Hence, the problem of determining the contribution of a given independent variable to a given dependent variable may be interpreted as that of determining the ratio of the variance of the common factor to the variance of the dependent variable.


TABLE I

ILLUSTRATIONS OF THE VALUES OF TWO CORRELATED
VARIABLES OBTAINED BY ADDING THE CORRESPONDING
VALUES OF UNCORRELATED SUB-VARIABLES

  X_0 = A_1 + E_0      X_1 = A_1 + E_1
   25 =  16 +  9        27 =  16 + 11
   20 =  17 +  3        20 =  17 +  3
   22 =  16 +  6        26 =  16 + 10
   24 =  15 +  9        20 =  15 +  5
   23 =  16 +  7        23 =  16 +  7
   23 =  16 +  7        23 =  16 +  7
   25 =  15 + 10        21 =  15 +  6
   22 =  19 +  3        29 =  19 + 10
   20 =  12 +  8        18 =  12 +  6
   22 =  13 +  9        22 =  13 +  9
   17 =   9 +  8        17 =   9 +  8
   23 =  14 +  9        18 =  14 +  4
   23 =  17 +  6        25 =  17 +  8
   15 =   7 +  8        16 =   7 +  9
   22 =  16 +  6        24 =  16 +  8


TO VARIANCE RATIO 


If $x_0 = a_{01} + e_0$ and $x_1 = c_1 a_{01} + e_1$ are substituted in the formula

$$r_{01} = \frac{\Sigma x_0 x_1}{N \sigma_0 \sigma_1},$$

the expression thus obtained may be simplified to

$$r_{01}^2 = \frac{\sigma_{a_{01}}^2}{\sigma_{a_{01}}^2 + \sigma_{e_0}^2} \cdot \frac{c_1^2 \sigma_{a_{01}}^2}{c_1^2 \sigma_{a_{01}}^2 + \sigma_{e_1}^2}$$

The first of the two fractions in the right-hand member is the variance ratio. For a given value of $r_{01}$, the value of this ratio is a minimum when the second is unity, i.e., when $\sigma_{e_1} = 0$ and $x_1$ is a component of $x_0$. When $\sigma_{e_1} \neq 0$, i.e., when $x_1$ is not a component of $x_0$, the value of the variance ratio is larger. It is equal to $r_{01}$ when $\sigma_{e_1} = c_1 \sigma_{e_0}$. For other values of $\sigma_{e_1}$, the value of the variance ratio can only be estimated unless certain supplementary data are available.³ For a given value of the variance ratio, the value of $r_{01}$ decreases as $\sigma_{e_1}$ increases.
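The two special cases named above can be checked numerically. The following is a minimal sketch, not part of the original article; the function name and the variance values are illustrative assumptions. It computes $r_{01}$ and the two fractions of the product formula directly from assumed factor variances.

```python
import math


def correlation_from_factors(var_a, var_e0, var_e1, c1):
    """For x0 = a + e0 and x1 = c1*a + e1 (uncorrelated factors), return
    (r01, variance_ratio, second_fraction)."""
    var_x0 = var_a + var_e0
    var_x1 = c1 ** 2 * var_a + var_e1
    variance_ratio = var_a / var_x0               # first fraction
    second_fraction = c1 ** 2 * var_a / var_x1    # second fraction
    r01 = c1 * var_a / math.sqrt(var_x0 * var_x1)
    return r01, variance_ratio, second_fraction


# Case 1: x1 is a component of x0 (sigma_e1 = 0), so the ratio equals r01 squared.
r_comp, ratio_comp, second_comp = correlation_from_factors(7.5, 2.5, 0.0, 1.0)

# Case 2: sigma_e1 = c1 * sigma_e0 (variances 10.0 = 2**2 * 2.5), so the ratio equals r01.
r_eq, ratio_eq, second_eq = correlation_from_factors(7.5, 2.5, 10.0, 2.0)

print(r_comp, ratio_comp, second_comp)
print(r_eq, ratio_eq, second_eq)
```

In both cases the variance ratio is the same; only the second fraction, and with it $r_{01}$, changes.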


1. The values of these uncorrelated variables were secured by counting the number of heads or tails resulting from tosses of collections of coins. For example, the values labeled $A_1$ are the counts of heads or tails resulting from tosses of thirty coins. As a means of eliminating the effect of any imperfections in the coins, heads were counted for the first fifteen tosses, tails for the next fifteen, and so on. For use in the calculations described later, counts of one hundred tosses of each collection were made. Table I gives only illustrative values.

2. The reader's attention is called to the fact that the arguments relating to correlation analysis are in terms of deviation measures. The data derived by counting tosses of coins were not reduced to deviation measures because they are in terms of the same units and expressed from absolute zero points. Hence, the writers are justified in using these raw measures as if they were deviation measures. Throughout the article, large letters will be employed to denote raw measures and small letters will be used for the deviation measures.

3. For a description of the technique, see Tryon, R. C. "The Interpretation of the Coefficient of Correlation", Psychological Review, 36:419-45, September, 1929.







PARTIAL CORRELATION AS A MEANS OF 
ELIMINATING THE EFFECT OF FACTORS 
OF HETEROGENEITY 


The correlation between two variables is affected by the variability of the population with reference to other variables. For example, the correlation between test scores in two school subjects such as arithmetic and silent reading will vary with the variability of the population relative to intelligence test scores. Usually the correlation between two variables is calculated for a population that is heterogeneous with reference to one or more other traits, and frequently the correlation is desired for a population that is homogeneous in certain respects. The technique of partial correlation represented by the formula

$$r_{01.2} = \frac{r_{01} - r_{02} r_{12}}{\sqrt{1 - r_{02}^2}\,\sqrt{1 - r_{12}^2}}$$

has been proposed as a means of obtaining a coefficient of correlation for a homogeneous population from the data collected from a population heterogeneous with reference to a measured trait. It has been pointed out¹ that partial correlation may fail to yield this result. By employing variables whose values are sums of uncorrelated sub-variables as illustrated in Table I, the desired correlation may be calculated directly as well as by means of partial correlation. Hence, we have a means of testing the operation of this technique. The results obtained by means of the partial correlation formula and by direct calculation are given for several factor patterns.





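1. Burks, B. S. "Statistical Hazards in Nature-Nurture Investigations", Twenty-Seventh Yearbook of the National Society for the Study of Education, Part I (Bloomington, Illinois: Public School Publishing Company, 1928, pp. 12-15).
See also:
Burks, B. S. "On the Inadequacy of the Partial and Multiple Correlation Technique", Journal of Educational Psychology, 17:532-40, 625-30, November, December, 1926.


Illustration 1.

$X_0 = A_1 + A_2 + A_3 + A_4 + E_0$
$X_1 = A_1 + A_2 + E_1$
$X_2 = A_1 + A_3 + E_2$
$X_3 = A_1$

$r_{01.2}$ = [illegible]
$r_{01.3}$ = [illegible]

By direct calculation, the correlation between $A_2 + A_3 + A_4 + E_0$ and $A_2 + E_1$ is .200.
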
Illustration 2. 
X, = A, + A, 
X, =A, + As 
Xe A; + Ag 


+ B, 
+ E, 
+ Ey 
By partial correlation formula 

023. 
By direct calculation, the correlation 


Toi-2 


| between A, + E, and A, + Ey is -.025. 


Illustration 3 

= A, + Az + Ay + 

= A, + A, + A; + 

= A, 

Z®A, +A, + Ag +E 
A, + SA, + SA, 
Illustration 4. 

Xo = A, + A, + EQ + 
Xy As + Ag - Eo - 
X2 °o 4A, 

X; ° A. 

X4 o + Ag + Ay 


Ey 
Ey 
Toaez = ~-201 
= -,251 


= -,254 


Toi.3 
Toi-s 
By direct calculation, the correlation 
between 
Ay a Ae + E, and As + A, = E, is -.139 
The explanation of these discrepancies is apparent when the derivation of the formula for partial correlation is examined.
The partial coefficient, $r_{01.2}$, represents the correlation between¹

$$\left(x_0 - r_{02}\frac{\sigma_0}{\sigma_2}x_2\right) \quad \text{and} \quad \left(x_1 - r_{12}\frac{\sigma_1}{\sigma_2}x_2\right)$$

which are errors of estimate.² If $X_2$ is a component of $X_0$, that is, contributes itself completely to $X_0$, and $X_0$ and $X_2$ are expressed in terms of equivalent units, $r_{02}\frac{\sigma_0}{\sigma_2}$ becomes unity and the first error of estimate can be expressed simply as $X_0 - X_2$. Similarly, if $X_2$ is a component of $X_1$, the second error of estimate can be written $X_1 - X_2$. In this case then, the partialling out of $X_2$ from $X_0$ and $X_1$ is a matter of simple subtraction, and the obtained coefficient is a measure of the correlation between the remainders or residuals. If the variable partialled out is not a component, the quantity subtracted is only a best estimate of what it is desired to remove and the remainder is only an estimate of what the investigator is attempting to obtain.

When the variables are defined as in Illustration 1, the value of $r_{01.2}$ obtained by applying the partial correlation formula is the correlation between

$$\left((A_1 + A_2 + A_3 + A_4 + E_0) - r_{02}\frac{\sigma_0}{\sigma_2}(A_1 + A_3 + E_2)\right)$$

and

$$\left((A_1 + A_2 + E_1) - r_{12}\frac{\sigma_1}{\sigma_2}(A_1 + A_3 + E_2)\right).$$

In other words, all of the $A_1$ is not removed since the terms $r_{02}\frac{\sigma_0}{\sigma_2}$ and $r_{12}\frac{\sigma_1}{\sigma_2}$ are equal to unity only when $X_2$ contributes itself completely to $X_0$ and $X_1$. The result is that the obtained variables are not perfectly homogeneous with respect to $A_1$ and in addition have been made heterogeneous with respect to $E_2$. Hence, in employing the partial correlation technique, if the desired results are to be obtained, one must be certain that the variable partialled out consists only of one or more components which are known to be present in both $X_0$ and $X_1$. If $X_2$ contains factors not included in $X_0$ and $X_1$, only an approximate removal of $X_2$ is accomplished.
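The argument can be tried on simulated coin counts. The sketch below is an illustration and not the writers' calculation: it assumes the factor pattern of Illustration 1 as it is restated in the preceding paragraph, uses thirty coins for every sub-variable (the article states this only for $A_1$), and draws far more tosses than the one hundred used in the article so that sampling fluctuation is small. All function names are hypothetical.

```python
import math
import random


def heads(rng, coins=30):
    """Count heads in one toss of `coins` fair coins (one random bit per coin)."""
    return bin(rng.getrandbits(coins)).count("1")


def pearson(xs, ys):
    """Product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def partial(r01, r02, r12):
    """First-order partial correlation r_01.2 from three total correlations."""
    return (r01 - r02 * r12) / math.sqrt((1 - r02 ** 2) * (1 - r12 ** 2))


def simulate(n=20000, seed=0):
    rng = random.Random(seed)
    x0, x1, x2, x3, rest0, rest1 = [], [], [], [], [], []
    for _ in range(n):
        a1, a2, a3, a4, e0, e1, e2 = (heads(rng) for _ in range(7))
        x0.append(a1 + a2 + a3 + a4 + e0)
        x1.append(a1 + a2 + e1)
        x2.append(a1 + a3 + e2)   # not a component: carries A3 and E2 as well
        x3.append(a1)             # a component of both X0 and X1
        rest0.append(a2 + a3 + a4 + e0)   # X0 with A1 removed by subtraction
        rest1.append(a2 + e1)             # X1 with A1 removed by subtraction
    r01 = pearson(x0, x1)
    r01_2 = partial(r01, pearson(x0, x2), pearson(x1, x2))
    r01_3 = partial(r01, pearson(x0, x3), pearson(x1, x3))
    direct = pearson(rest0, rest1)
    return r01_2, r01_3, direct


r01_2, r01_3, direct = simulate()
print(r01_2, r01_3, direct)
```

Under these assumptions, partialling out the component $X_3$ reproduces the directly calculated correlation closely, while partialling out $X_2$ leaves a noticeably larger coefficient.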

The usual interpretation of a coefficient of partial correlation assumes a factor pattern of the type

$$x_0 = a_1 + a_2 + a_3$$
$$x_1 = a_1 + a_2 + a_4$$
$$x_2 = a_1$$

Usually $x_2$ is not a component of the other two variables and the calculated values of $r_{02}$ and $r_{12}$ are less than they would be if $x_2$ were a component. Suppose $r_{01} = .70$, $r_{02} = .50$, $r_{12} = .40$ for a factor pattern of the type given above. Then $r_{01.2} = .59$ is a measure of the actual net correlation between $x_0$ and $x_1$, i.e., the residue of correlation after the effect of $x_2$ has been eliminated. If, however, $x_2$ includes a factor uncorrelated with $x_0$ and $x_1$, the calculated values of $r_{02}$ and $r_{12}$ will be less than the corresponding coefficients for the component variable. These cannot be calculated, but for purposes of illustration we may take $r_{02} = .65$ and $r_{12} = .60$. For these values $r_{01.2} = .51$. This result is indicative of the effect of an uncorrelated factor in the variable partialled out. The magnitude of the effect upon the coefficient of partial correlation varies, and in the absence of information concerning the factor pattern involved, dependable estimates cannot be made.

1. This concept of partial correlation is found in the work of Yule who developed the technique. See:
Yule, G. Udny. An Introduction to the Theory of Statistics. London: Charles Griffin and Company Ltd., 1917, p. 256.
For a more recent expression of the idea, see:
Dunlap, J. W. and Cureton, E. E. "On the Analysis of Causation", Journal of Educational Psychology, 21:664-65, December, 1930.

2. Errors of estimate are those involved in estimating the values of one variable from those of another by the use of the regression equation. The word "residuals", employed by Yule, appears to be a more appropriate term for the above expressions, but "errors of estimate" is more generally employed at the present time.
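The arithmetic of the partial correlation formula is easily reproduced. The following short sketch (the function name is an assumption made for illustration) evaluates the formula for the values assumed for the component variable in the paragraph above.

```python
import math


def partial_correlation(r01, r02, r12):
    """r_01.2 = (r01 - r02*r12) / (sqrt(1 - r02^2) * sqrt(1 - r12^2))."""
    numerator = r01 - r02 * r12
    denominator = math.sqrt(1 - r02 ** 2) * math.sqrt(1 - r12 ** 2)
    return numerator / denominator


# Values assumed in the text for the component variable.
r_component = partial_correlation(0.70, 0.65, 0.60)
print(round(r_component, 2))
```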

It is surprising that in spite of the fact that Yule in his early work made clear what partial correlation actually accomplishes, many writers have made misleading statements concerning it and have employed it in situations in which the technique failed to accomplish the desired end, even though the writer thought it did so. Buckingham¹ makes the statement that the partial correlation coefficient measures the correlation between two things after the influence of others has been taken away. He then goes on with an example in which he obtains the correlation between teachers' salaries and teaching ability, holding constant or removing the effect of professional training. Dunlap and Cureton² state that the coefficient of partial correlation measures the correlation between that part of a variable which is uncorrelated with one or more others and that part of a second variable which is also uncorrelated with these others. Various other writers have made similar statements. If the variable partialled out is not a component, as is usually the case, the obtained coefficient of partial correlation is a measure of the correlation between best estimates. Hence, an investigator should be cautious in the use of the technique of partial correlation and in interpreting results obtained by its use.

It seems unlikely that partial correlation will accomplish the desired result when the variable partialled out consists of test scores or age scores derived from them. Such variables involve variable errors of measurement which will constitute an uncorrelated factor. The effect of this factor may be materially reduced, if not eliminated, by making corrections for attenuation, but it is likely that test scores involve other uncorrelated factors. The statement has been made that it is difficult to conceive of any variable other than chronological age that can be satisfactorily partialled out, and partial correlation will not yield the desired result in this case unless the relationship with the other variable is linear.

Other systems of partial correlation have been proposed,³ but their application is subject to similar limitations.


CONTRIBUTIONS FROM TWO OR MORE MEASURED 
CAUSES: MULTIPLE REGRESSION AND PATH 
COEFFICIENTS 


It is frequently desired to secure measures of the contributions of two or more causes to a given effect or dependent variable. A basis for accomplishing this is an algebraic equation which expresses the dependent variable as a linear function of the independent variables given as causes. An approach to an understanding of the procedure may be made by considering first the case of two uncorrelated causal variables. If $x_0 = x_1 + x_2$, and $x_1$ and $x_2$ are uncorrelated, the determination of the contributions of each of the independent variables to the variance of the dependent variable is very simple.⁴ Since the variables are expressed as deviations from their respective means,

$$\frac{\Sigma x_0^2}{N} = \frac{\Sigma (x_1 + x_2)^2}{N} = \frac{\Sigma (x_1^2 + 2x_1x_2 + x_2^2)}{N} = \frac{\Sigma x_1^2}{N} + \frac{2\Sigma x_1x_2}{N} + \frac{\Sigma x_2^2}{N}$$

Since $x_1$ and $x_2$ are uncorrelated, $\Sigma x_1x_2 = 0$ and we have $\sigma_0^2 = \sigma_1^2 + \sigma_2^2$. Hence, the per cent of the variance of $x_0$ contributed by $x_1$ is given by the ratio $\sigma_1^2/\sigma_0^2$. Similarly, $\sigma_2^2/\sigma_0^2$ gives the per cent contributed by $x_2$. The value of the variance ratio $\sigma_1^2/\sigma_0^2$ is equal to $r_{01}^2$, and the variance ratio $\sigma_2^2/\sigma_0^2$ is equal to $r_{02}^2$.

1. Buckingham, B. R. "Partial Correlation", Journal of Educational Research, 7:544-49, April, 1923. (An editorial.)

2. Dunlap and Cureton, op. cit., p. 665.

3. Dunlap, J. W. and Cureton, E. E., op. cit., pp. 665-72.
The origin of semi-partial correlation dates back to the early work of Spearman. See:
Spearman, C. "The Proof and Measurement of the Association between Two Things", American Journal of Psychology, 15:94, 1904.
The form of the formula for three variables which is given by Dunlap and Cureton was also proposed by Franzen. See:
Franzen, Raymond. "A Comment on Partial Correlation", Journal of Educational Psychology, 19:194-97, March, 1928.

4. In this very simple illustration, $x_0$ is assumed to be completely determined by $x_1$ and $x_2$. Also, $x_1$ and $x_2$ are considered to be in terms of equivalent units. The arguments here given may be extended to the case in which the dependent variable is the weighted sum of its components.

This development may be extended to any number of uncorrelated component independent variables, one of which may represent unmeasured causes. In the typical case, however, the independent variables are correlated and are not components, i.e., do not contribute themselves completely to the dependent variable. This means that the dependent variable cannot be precisely expressed as the weighted sum of the independent variables and unmeasured causes. Hence, the equation developed will be only an approximate expression of the relationship.

The expression,

$$b_{01.2}x_1 + b_{02.1}x_2$$

gives the best estimate of $x_0$ that can be obtained from the independent variables, $x_1$ and $x_2$. Hence, we use

$$x_0 = b_{01.2}x_1 + b_{02.1}x_2 + u$$

as the best linear expression of the relationship. In this equation $b_{01.2}$ and $b_{02.1}$ are the ordinary regression coefficients. In terms of the symbolism of the regression equations, $u = x_0 - \bar{x}_0$, where $\bar{x}_0$ is the estimated value of $x_0$. This means that the term $u$ will include errors due to the use of the terms $b_{01.2}x_1$ and $b_{02.1}x_2$ in addition to unmeasured causes. In the development which follows, $u$ will be considered to be uncorrelated with $x_1$ and $x_2$. When this is not true, an additional approximation is introduced.

$$\sigma_0^2 = \frac{\Sigma x_0^2}{N} = \frac{\Sigma (b_{01.2}x_1 + b_{02.1}x_2 + u)^2}{N} = \frac{\Sigma (b_{01.2}^2x_1^2 + b_{02.1}^2x_2^2 + 2b_{01.2}b_{02.1}x_1x_2 + u^2 + 2b_{01.2}x_1u + 2b_{02.1}x_2u)}{N}$$

$$= b_{01.2}^2\frac{\Sigma x_1^2}{N} + b_{02.1}^2\frac{\Sigma x_2^2}{N} + 2b_{01.2}b_{02.1}\frac{\Sigma x_1x_2}{N} + \frac{\Sigma u^2}{N} + 2b_{01.2}\frac{\Sigma x_1u}{N} + 2b_{02.1}\frac{\Sigma x_2u}{N}$$

Since $u$ is assumed to be uncorrelated with either $x_1$ or $x_2$, $\Sigma x_1u$ and $\Sigma x_2u$ are equal to zero, and the last two fractions in the above equation disappear. The remaining product term may be written as follows:

$$2b_{01.2}b_{02.1}\frac{\Sigma x_1x_2}{N} = 2b_{01.2}b_{02.1}\frac{\Sigma x_1x_2}{N\sigma_1\sigma_2}\sigma_1\sigma_2 = 2b_{01.2}b_{02.1}r_{12}\sigma_1\sigma_2$$

Thus we have

$$\sigma_0^2 = b_{01.2}^2\sigma_1^2 + b_{02.1}^2\sigma_2^2 + 2b_{01.2}\sigma_1 b_{02.1}\sigma_2 r_{12} + \sigma_u^2$$

The term $b_{01.2}^2\sigma_1^2$ represents the direct contribution of $x_1$, the term $b_{02.1}^2\sigma_2^2$ represents the direct contribution of $x_2$, and the term $2b_{01.2}\sigma_1 b_{02.1}\sigma_2 r_{12}$ is a measure of the joint contribution of $x_1$ and $x_2$ to $x_0$. The last term, $\sigma_u^2$, represents the contribution of unmeasured causes and the errors due to the approximations introduced. The direct and joint contributions of $x_1$ and $x_2$, as well as the contribution of $u$, may be expressed in per cents by dividing both sides of the equation by $\sigma_0^2$.

$$1 = \frac{\sigma_0^2}{\sigma_0^2} = b_{01.2}^2\frac{\sigma_1^2}{\sigma_0^2} + b_{02.1}^2\frac{\sigma_2^2}{\sigma_0^2} + 2b_{01.2}\frac{\sigma_1}{\sigma_0}b_{02.1}\frac{\sigma_2}{\sigma_0}r_{12} + \frac{\sigma_u^2}{\sigma_0^2}$$

The term $b_{01.2}^2\frac{\sigma_1^2}{\sigma_0^2}$ is the square of the corresponding Beta coefficient of the multiple regression equation and may be written $\beta_{01.2}^2$. Likewise, $b_{02.1}^2\frac{\sigma_2^2}{\sigma_0^2}$ may be written $\beta_{02.1}^2$, and the term representing the joint contribution of $x_1$ and $x_2$ may be written $2\beta_{01.2}\beta_{02.1}r_{12}$. Hence, the equation may be written

$$1 = \beta_{01.2}^2 + \beta_{02.1}^2 + 2\beta_{01.2}\beta_{02.1}r_{12} + \frac{\sigma_u^2}{\sigma_0^2}$$

This equation may be expressed in terms of the following symbols.

$$1 = d_{01.2} + d_{02.1} + 2p_{01.2}p_{02.1}r_{12} + d_{0u}$$
$$= d_{01.2} + d_{02.1} + d_{012} + d_{0u}$$

The term $d_{01.2}$ is read the coefficient of direct determination of $x_1$ with respect to $x_0$, $d_{02.1}$ is the coefficient of direct determination of $x_2$ with respect to $x_0$, and $d_{012}$ is the coefficient of joint determination of $x_1$ and $x_2$ with respect to $x_0$. If the term $d_{012}$ is written $2p_{01.2}p_{02.1}r_{12}$, the symbol $p_{01.2}$ designates the path coefficient¹ connecting $x_0$ and $x_1$ and the symbol $p_{02.1}$ designates the path coefficient connecting $x_0$ and $x_2$.
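For two independent variables the whole decomposition can be computed from the three correlations. The sketch below is illustrative only: it uses the standard expressions for the Beta coefficients of a two-predictor regression, $\beta_{01.2} = (r_{01} - r_{02}r_{12})/(1 - r_{12}^2)$ and its counterpart, and the correlation values and names are assumptions rather than data from the article.

```python
def determination_two_causes(r01, r02, r12):
    """Coefficients of determination for x0 on x1 and x2, from correlations."""
    beta_1 = (r01 - r02 * r12) / (1 - r12 ** 2)   # beta_01.2 (path coefficient p_01.2)
    beta_2 = (r02 - r01 * r12) / (1 - r12 ** 2)   # beta_02.1 (path coefficient p_02.1)
    d_direct_1 = beta_1 ** 2                      # d_01.2
    d_direct_2 = beta_2 ** 2                      # d_02.1
    d_joint = 2 * beta_1 * beta_2 * r12           # d_012
    d_unmeasured = 1 - (d_direct_1 + d_direct_2 + d_joint)   # d_0u
    return {
        "beta_1": beta_1,
        "beta_2": beta_2,
        "d_direct_1": d_direct_1,
        "d_direct_2": d_direct_2,
        "d_joint": d_joint,
        "d_unmeasured": d_unmeasured,
    }


result = determination_two_causes(r01=0.6, r02=0.5, r12=0.4)
for name, value in result.items():
    print(name, round(value, 4))
```

The three known coefficients sum to $\beta_{01.2}r_{01} + \beta_{02.1}r_{02}$, the square of the multiple correlation, which gives an independent check on the arithmetic.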
The general equation in terms of coefficients of determination for n independent variables is:

$$1 = d_{01.23 \ldots n} + d_{02.13 \ldots n} + \cdots + d_{0n.12 \ldots (n-1)} + d_{012.34 \ldots n} + d_{013.24 \ldots n} + \cdots + d_{0ij. \ldots} + d_{0u}$$

In the next to the last term, i takes any value from 1 to n, and j takes any value from 1 to n other than i. There will be as many terms of joint determination as there are possible pairs of independent variables. These coefficients of joint determination may be expressed in terms of the coefficient of correlation between the two independent variables and the path coefficients connecting them with the dependent variable. For example,

$$d_{012.34 \ldots n} = 2r_{12}\,p_{01.23 \ldots n}\,p_{02.13 \ldots n}$$

The values of the coefficients of determination $d_{01.23 \ldots n}$, $d_{02.13 \ldots n}$, $d_{012.34 \ldots n}$, etc., may be calculated from the data. The value of $d_{0u}$ is obtained by subtracting the sum of the known coefficients of determination, i.e., $d_{01.23 \ldots n}$, $d_{02.13 \ldots n}$, $d_{012.34 \ldots n}$, etc., from unity.


The coefficients of determination may be obtained from the beta coefficients of the regression equation or they may be calculated by means of Wright's method of path coefficients. The fundamental theorem of Wright's method² may be stated as follows: given $x_0$, the dependent variable, and $x_1, x_2, x_3, \ldots, x_n$ independent variables, the coefficient of correlation between $x_0$ and any of the independent variables, or between any two of the independent variables, is equal to the path coefficient connecting the two variables plus the sum of the products of the path coefficients along all paths of indirect connection, not including those through the dependent variable.¹

1. A path coefficient is defined as the ratio of that part of the standard deviation of a variable which is due to another variable to the total standard deviation of the variable. In other words, a path coefficient represents the ratio of the estimate of that part of the standard deviation of the dependent variable which is due to another variable to the total standard deviation of the dependent variable. Thus, $p_{01.2} = b_{01.2}\frac{\sigma_1}{\sigma_0}$ as indicated above, and the term $b_{01.2}^2\frac{\sigma_1^2}{\sigma_0^2}$, which was written as the Beta coefficient $\beta_{01.2}^2$, might also be written $p_{01.2}^2$; hence $\beta_{01.2} = p_{01.2}$.
For the development of path coefficients, see:
Wright, Sewall. "Correlation and Causation", Journal of Agricultural Research, 20:557-85, January, 1921.
Wright, Sewall. "The Theory of Path Coefficients", Genetics, 8:239-55, May, 1923.
For further proof of the identity of path coefficients and Beta coefficients, see:
Kelly, E. L. "The Relationship between the Techniques of Partial Correlation and Path Coefficients", Journal of Educational Psychology, 20:119-24, February, 1929.
Dunlap, J. W. and Cureton, E. E. "On the Analysis of Causation", Journal of Educational Psychology, 21:675-75, December, 1930.

2. For illustration of Wright's method, see:
Burks, B. S. "The Relative Influence of Nature and Nurture Upon Mental Development; a Comparative Study of Foster Parent-Foster Child Resemblance and True Parent-True Child Resemblance", Twenty-Seventh Yearbook of the National Society for the Study of Education, Part I. Bloomington, Illinois: Public School Publishing Company, 1928, pp. 299-301.
Heilman, J. D. "The Relative Influence Upon Educational Achievement of Some Hereditary and Environmental Factors", The Twenty-Seventh Yearbook of the National Society for the Study of Education, Part II. Bloomington, Illinois: Public School Publishing Company, 1928, pp. 35-65. For a more extended account, see:
Heilman, J. D. "Factors Determining Achievement and Grade Location", Journal of Genetic Psychology, 36:455-56, September, 1929.

In the case of three independent variables, this theorem provides the basis for writing the following equations²

$$r_{01} = p_{01} + p_{12}p_{02} + p_{13}p_{03} + p_{12}p_{23}p_{03} + p_{13}p_{23}p_{02}$$
$$r_{02} = p_{02} + p_{12}p_{01} + p_{23}p_{03} + p_{23}p_{13}p_{01} + p_{12}p_{13}p_{03}$$
$$r_{03} = p_{03} + p_{13}p_{01} + p_{23}p_{02} + p_{23}p_{12}p_{01} + p_{13}p_{12}p_{02}$$
$$r_{12} = p_{12} + p_{23}p_{13}$$
$$r_{13} = p_{13} + p_{23}p_{12}$$
$$r_{23} = p_{23} + p_{12}p_{13}$$

Although not attempted by Wright, Heilman, or Burks, it is apparent that these equations may be simplified. Collecting terms in the first three equations, we have

$$r_{01} = p_{01} + [(p_{12} + p_{13}p_{23})p_{02}] + [(p_{13} + p_{12}p_{23})p_{03}]$$
$$r_{02} = p_{02} + [(p_{12} + p_{23}p_{13})p_{01}] + [(p_{23} + p_{12}p_{13})p_{03}]$$
$$r_{03} = p_{03} + [(p_{13} + p_{23}p_{12})p_{01}] + [(p_{23} + p_{12}p_{13})p_{02}]$$

But

$$r_{12} = p_{12} + p_{23}p_{13}$$
$$r_{13} = p_{13} + p_{23}p_{12}$$
$$r_{23} = p_{23} + p_{12}p_{13}$$

Hence the first three equations may be written as follows:

$$r_{01} = p_{01} + r_{12}p_{02} + r_{13}p_{03}$$
$$r_{02} = p_{02} + r_{12}p_{01} + r_{23}p_{03}$$
$$r_{03} = p_{03} + r_{13}p_{01} + r_{23}p_{02}$$

Examination of these equations will indicate how the corresponding equations for any number of variables may be written.

The above equations are the "normal equations" from which regression coefficients may be calculated.³ The schema formulated by Griffin⁴ affords an economical method. The coefficient of multiple correlation affords a means of checking the calculations involved in securing the values of the coefficients of determination. The square of the coefficient of multiple correlation $R_{0.12 \ldots n}$ is equal to the sum of the coefficients of determination, exclusive of $d_{0u}$. In other words, $R_{0.12 \ldots n}^2 = 1 - d_{0u}$. Since $R_{0.12 \ldots n}$ is obtained through a different set of calculations, a check between $R_{0.12 \ldots n}^2$ and the sum of the coefficients of determination is a good indication that the calculations are without error.
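The solution of the normal equations and the check against the multiple correlation may be sketched as follows. This is an illustration only, not the schema of Griffin or of Tolley and Ezekiel: it solves the equations by ordinary elimination, and the correlations used are assumed population values for a dependent variable built from one general factor, three unique factors, and an error, all of equal variance. All names are hypothetical.

```python
import math


def solve_linear(matrix, rhs):
    """Solve matrix * p = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def determination_coefficients(r_dep, r_indep):
    """Path (Beta) coefficients and coefficients of determination.

    r_dep[i]      correlation of the dependent variable with independent variable i+1
    r_indep[i][j] correlation between independent variables i+1 and j+1
    """
    p = solve_linear(r_indep, r_dep)              # normal equations: r_0i = sum_j r_ij p_0j
    n = len(p)
    direct = [p_i ** 2 for p_i in p]
    joint = {(i + 1, j + 1): 2 * r_indep[i][j] * p[i] * p[j]
             for i in range(n) for j in range(i + 1, n)}
    r_squared = sum(p_i * r_i for p_i, r_i in zip(p, r_dep))   # R^2 found separately
    return p, direct, joint, r_squared


# Assumed correlations: X0 = A1+A2+A3+A4+E0, Xi = A1+A(i+1)+Ei, equal factor variances.
r0 = 2 / math.sqrt(15)
r_dep = [r0, r0, r0]
r_indep = [[1.0, 1 / 3, 1 / 3],
           [1 / 3, 1.0, 1 / 3],
           [1 / 3, 1 / 3, 1.0]]

p, direct, joint, r_squared = determination_coefficients(r_dep, r_indep)
total_known = sum(direct) + sum(joint.values())
print(total_known, r_squared, 1 - total_known)
```

Agreement between `total_known` and `r_squared` is the check described above; the remainder `1 - total_known` corresponds to $d_{0u}$.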

The application of the method defined by the coefficients of determination may be illustrated by taking variables constructed from counts of coin tosses as follows:

$X_0 = A_1 + A_2 + A_3 + A_4 + E_0$
$X_1 = A_1 + A_2 + E_1$
$X_2 = A_1 + A_3 + E_2$
$X_3 = A_1 + A_4 + E_3$

The structure of these variables is probably not greatly different from that which we might have for a situation in which the contributions of certain factors to achievement in a subject such as chemistry are being studied. In this set-up, $X_0$ may be thought of as the dependent variable representing scores on an achievement test in chemistry, and $X_1$, $X_2$, and $X_3$ may be thought of as measures of abilities which contribute to achievement in chemistry. The component $A_1$ may be thought of as a "general factor"; $A_2$, $A_3$, and $A_4$ may be thought of as factors unique to each ability; $E_0$, $E_1$, $E_2$, and $E_3$ may be thought of as representing variable errors of measurement and validity.

1. For proof of this theorem, see the references to Wright's work.

2. The reader should note that the subscripts of the path coefficients have been simplified.

3. Tolley, H. R. and Ezekiel, M. J. B. "A Method of Handling Multiple Correlation Problems", Journal of American Statistical Association, 18:993-1003, December, 1923.
Barrett, H. E. "A Modification of Tolley and Ezekiel's Method of Handling Multiple Correlation Problems", Journal of Educational Psychology, 19:45-49, January, 1928.

4. Griffin, H. D. "Simplified Schemas for Multiple Linear Correlation", Journal of Experimental Education, 1:239-S,

Applying the path coefficient technique to this problem, the following values were obtained for the coefficients of determination.

  $d_{01}$  = .1901
  $d_{02}$  = .1116
  $d_{03}$  = .0400
  $d_{012}$ = .1044
  $d_{013}$ = .0682
  $d_{023}$ = .0536
  Sum       = .5679

Subtracting the sum of the coefficients of determination from 1.00, we have $d_{0u}$ = .4321. By the above definition of $X_0$, the unmeasured cause is represented by $E_0$. Direct calculation gives the value $d_{0E_0}$ = .1000. This means that the obtained values of the coefficients of determination are too small. Instead of their sum being only .5679, it should be .9000. The attenuating effect is due to the fact that the use of the regression equation, when the independent variables are not components, results in only an approximate expression of the existing relationship. The calculated values of the coefficients of determination may indicate the relative order of magnitude of the contributions, but the fact that they are too small is a rather serious limitation of the technique.

Even if the coefficients of determination were not attenuated, there would still remain a serious difficulty of interpretation. For example, if $X_1$ and $X_2$ represent scores on an intelligence test and a silent reading test, the total coefficient of determination¹ for $X_1$ cannot be interpreted as a measure of the contribution from general intelligence because obviously the obtained measures of this trait and those of silent reading ability have much in common. Coefficients of determination for the factor common to $X_1$ and $X_2$, the component of $X_1$ uncorrelated with $X_2$, and the component of $X_2$ uncorrelated with $X_1$ would be more meaningful statistics. These factors may be described as defining causes that are elemental² with respect to the two given variables.


FACTOR ANALYSIS 


If several correlated variables are being considered together, it is generally agreed that four types of factors can be hypothesized, i.e., general factors, group factors, specific factors, and chance factors. A general factor is one that is present in all the variables which are being considered together and a group factor is one which is present in two or more variables. Specific factors and chance factors are unique to each particular variable. Let $a_1$, $a_2$, $a_3$ represent uncorrelated remote causes of $x_0$, $s_0$ a specific factor, and $e_0$ a chance factor. Assuming that $x_0$ is a linear function of its causes, it may be expressed,

$$x_0 = c_{01}a_1 + c_{02}a_2 + c_{03}a_3 + c_{0s}s_0 + c_{0e}e_0$$

If the a's account for the correlations of $x_0$ with the independent variables $x_1$, $x_2$, $x_3$, $x_4$, and $x_5$, they may be described as the elemental (remote) causes of the independent variables as well as of the dependent variable, and the contributions of the independent variables to the dependent variable may be thought of as being made through these remote causes. In such a case each independent variable would be expressed as a linear function of one or more of the remote causes, a specific factor, and a chance factor.











1. The total coefficient of determination for $X_1$ will consist of the coefficient of direct determination for this variable plus a portion of the coefficient of joint determination. In the absence of a better method, a coefficient of joint determination has been divided in proportion to the coefficients of direct determination of the two variables.

2. In explaining the general problem for which he developed the path coefficient technique, Wright introduced such a group of causes which he designated as "remote", but he does not give any technique for identifying and measuring their contributions.
$$
\begin{aligned}
x_1 &= c_{11}a_1 + c_{12}a_2 + c_{13}a_3 + c_{14}s_1 + c_{15}e_1 \\
x_2 &= c_{21}a_1 + c_{22}a_2 + c_{23}a_3 + c_{24}s_2 + c_{25}e_2 \\
x_3 &= c_{31}a_1 + c_{32}a_2 + c_{33}a_3 + c_{34}s_3 + c_{35}e_3 \\
x_4 &= c_{41}a_1 + c_{42}a_2 + c_{43}a_3 + c_{44}s_4 + c_{45}e_4 \\
x_5 &= c_{51}a_1 + c_{52}a_2 + c_{53}a_3 + c_{54}s_5 + c_{55}e_5
\end{aligned}
$$

If a factor is not a component of a particular independent variable,
the term corresponding to it in the above equations would be equal to
zero. For example, if one of the a's were not present in two of the
independent variables, the two terms containing it would drop out of
the equations written for those variables. In such a situation we
would say that the other two a's were general factors, being present
in all the variables, and the first would be called a group factor,
being present in only three of the variables.

The contribution of each of the remote causes to $x_0$ may be
expressed in terms of a variance ratio. For example, the contribution
of $a_1$ to $x_0$ is expressed by the variance ratio

$$\frac{c_{01}^2 \sigma_{a_1}^2}{\sigma_{x_0}^2}$$

If the a's, $s_0$, and $e_0$ are expressed in terms of standard
units, we may write

$$\sigma_{a_1} = \sigma_{a_2} = \sigma_{a_3} = \sigma_{s_0} = \sigma_{e_0} = 1$$

If the c's are chosen so that $\sigma_{x_0} = 1$, the variance ratios
reduce to squares of the c's and we have,

$$1 = c_{01}^2 + c_{02}^2 + c_{03}^2 + c_{04}^2 + c_{05}^2$$

The problem of determining the contributions from the remote or
elemental causes is one of determining the c's in the expression for
$x_0$, which are called factor loadings. As a basis for this
determination, we may write equations similar to the above for each
of the independent variables. The correlation between any two of the
variables in the above set of equations may be expressed in terms of
c's. For example, the correlation or communality between $x_0$ and
$x_1$ may be written,¹

$$r_{01} = c_{01}c_{11} + c_{02}c_{12} + c_{03}c_{13}$$

The coefficient of reliability is equal to 1.00 minus the square of
the factor loading of the corresponding chance factor. For example,

$$r_{00} = 1.00 - c_{05}^2 = c_{01}^2 + c_{02}^2 + c_{03}^2 + c_{04}^2$$

1. The proof is as follows:
The coefficient of correlation $r_{01}$ may be expressed,

$$r_{01} = \frac{\sum x_0 x_1}{N \sigma_0 \sigma_1}$$

Since $x_0$ and $x_1$ are expressed in terms of standard units, $\sigma_0 = \sigma_1 = 1$, and we can write,

$$r_{01} = \frac{\sum x_0 x_1}{N}$$

Substituting the values for $x_0$ and $x_1$ we have

$$r_{01} = \frac{\sum (c_{01}a_1 + c_{02}a_2 + c_{03}a_3 + c_{04}s_0 + c_{05}e_0)(c_{11}a_1 + c_{12}a_2 + c_{13}a_3 + c_{14}s_1 + c_{15}e_1)}{N}$$

Multiplying these two terms, all the resulting products involving uncorrelated components are equal to zero, hence,

$$r_{01} = c_{01}c_{11}\frac{\sum a_1^2}{N} + c_{02}c_{12}\frac{\sum a_2^2}{N} + c_{03}c_{13}\frac{\sum a_3^2}{N}$$

The terms $\frac{\sum a_1^2}{N}$, $\frac{\sum a_2^2}{N}$, and $\frac{\sum a_3^2}{N}$ are, of course, nothing more than the squares of the respective standard deviations $\sigma_{a_1}$, $\sigma_{a_2}$, $\sigma_{a_3}$. But $\sigma_{a_1} = \sigma_{a_2} = \sigma_{a_3} = 1$, since $a_1$, $a_2$, $a_3$ are expressed in terms of standard units. Hence,

$$r_{01} = c_{01}c_{11} + c_{02}c_{12} + c_{03}c_{13}$$

When the number of equations thus formed is equal to the number of
c's, it is theoretically possible to determine their values. However,
the labor of solving a large number of simultaneous quadratic
equations prohibits a direct attack as a feasible procedure. Kelley¹
has proposed a method of successive approximations based upon least
squares, but Holzinger² has shown that several different solutions
may be fitted to Kelley's data. Recently Thurstone³ has developed a
technique which gives a unique solution in the case of certain factor
patterns.
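
The expression for $r_{01}$ in terms of factor loadings can be checked numerically. The following is a minimal sketch, not part of the original article: the loadings in `c0` and `c1` are hypothetical values chosen so that each set of squares sums to one, and the factors are simulated as independent standard normal variables, which is an assumption of the sketch rather than of the text.

```python
import math
import random

def pearson(u, v):
    """Product-moment correlation of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

# Hypothetical loadings on a1, a2, a3, the specific factor, and the
# chance factor. The last loading is chosen so the squares sum to 1.
c0 = [0.6, 0.5, 0.4, 0.3, math.sqrt(1 - 0.36 - 0.25 - 0.16 - 0.09)]
c1 = [0.7, 0.3, 0.2, 0.5, math.sqrt(1 - 0.49 - 0.09 - 0.04 - 0.25)]

rng = random.Random(0)
x0, x1 = [], []
for _ in range(50_000):
    a = [rng.gauss(0, 1) for _ in range(3)]            # common factors
    s0, e0, s1, e1 = (rng.gauss(0, 1) for _ in range(4))  # unique factors
    x0.append(sum(c * f for c, f in zip(c0, a + [s0, e0])))
    x1.append(sum(c * f for c, f in zip(c1, a + [s1, e1])))

# Correlation implied by the loadings on the three common factors.
r01_model = sum(c0[j] * c1[j] for j in range(3))
# Correlation actually observed in the simulated sample.
r01_sample = pearson(x0, x1)
# Reliability of x0: 1.00 minus the squared chance-factor loading.
r00_model = 1.0 - c0[4] ** 2

print(round(r01_model, 3), round(r01_sample, 3), round(r00_model, 3))
```

With a large simulated sample the observed correlation should lie close to the sum of products of the common-factor loadings, while the specific and chance factors contribute nothing to it.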

Factor analysis appears to offer a means of determining the
contributions to a given dependent variable from the causes that are
elemental with reference to the group of independent variables. It
is not necessary to demonstrate that the given independent variables
are causally related to the variable designated as dependent. The
a's are causes and when all measures are expressed in terms of
standard units, the squares of the factor loadings of the dependent
variable will measure the contributions of the elemental components.
It should be noted, however, that the contributions measured are
those of the elemental causes to the variance of the dependent
variable and not to measures of the dependent variable. In other
words factor analysis will not yield measures of the contributions
to measures of achievement or of other traits.

The principal conclusions of this article may be stated formally as
follows:

1. If a variable is known to be a component of another variable,
the contribution of the first or of a common cause perfectly
correlated with it to the variance of the second is measured by the
square of the coefficient of correlation between the two variables.

2. The contributions of a third corre- 
lated variable may be partialed out when it 
is a component of the other two. When it 
is not a component, application of partial 
correlation yields only an estimate of the 
net correlation between the two variables. 
This estimate tends to be numerically larg- 
er than the true net correlation. 

3. The measures of the contributions 
of independent variables obtained from the 
beta coefficients of a multiple regression 
equation or by means of Wright's path co- 
efficient technique are attenuated esti- 
mates. Hence, these methods are not satis- 
factory for studying the contributions of 
independent variables to a dependent vari- 
able. 

4. Factor analysis appears to afford 
a means for securing measures of the con- 
tributions from elemental causes, but the 
method proposed by Kelley and that first 
proposed by Thurstone are not adequate. 
Thurstone's method for a unique solution 
appears to be satisfactory, but its appli- 
cation is dependent upon the existence of 
a certain factor pattern. 





1. Kelley, T. L. Crossroads in the Mind of Man. Stanford University, California: Stanford University Press, 1928, p. 122f. Another reference dealing with the same problem is Hotelling, Harold. "Analysis of a Complex of Statistical Variables into Principal Components", Journal of Educational Psychology, 24:417-41, 498-520, September, October, 1933.

2. Holzinger, Karl J. and Swineford, Frances. "Uniqueness of Factor Patterns", Journal of Educational Psychology, 23:247-58, March, 1932.

3. No published account of Thurstone's work is yet available. The interested reader will find a description of a portion of the procedure in Thurstone, L. L. The Theory of Multiple Factors. Ann Arbor, Michigan: Edwards Brothers, Inc., June, 1932. 65 pp. Thurstone, L. L. A Simplified Multiple Factor Method and an Outline of the Computations. Ann Arbor, Michigan: Edwards Brothers, 1933. 26 pp.
THE CORRELATION COEFFICIENT AS AN INDEX OF RELATIONSHIP


by


Paul Hanly Furfey

and

Joseph F. Daly

The Catholic University of America


During the last few decades the technique of correlation has been
increasingly used in the social sciences to measure what is called
the relation between variables. Thus it is usual to say that if two
variables yield a correlation coefficient of +0.95 they are "closely
related." If the coefficient is near zero, we say they are
"unrelated." We use such language with more confidence if the
coefficients in question are product-moment coefficients or ones
considered equivalent to such coefficients. But the concept of
measuring relationship is quite general and is used to some extent
in regard to a variety of different techniques.

In view of the wide use of correlation and the considerable amount
of mathematical discussion which it has evoked, it seems strange
that there has been very little discussion of the meaning of the
words relation and closeness of relationship as used in this
connection. The present article is offered as a contribution to this
problem. In what sense can we say that correlation measures
relationship?

It is hardly necessary to remark that correlation does not measure
causal relationship. For example, if it is shown that there is a
high correlation between annual marriage rates and some index of
general business condition, then the said correlation does not prove
that fluctuations in business cause changes in marriage rates, nor
that marriage rates change the condition of business, nor that both
variables depend on some third factor. These questions are to be
decided from considerations quite apart from the mathematics of
correlation. Mathematics cannot measure causality. At most it can
tell us something about concomitant variation. Since the term
relationship may suggest the existence of causal relationship, it
would probably be better to discard it altogether and use the less
ambiguous term concomitant variation. We shall not, however,
endeavor to introduce this somewhat awkward term but shall continue
in this article to use the terms relation and relationship, bearing
in mind the restrictions discussed in this paragraph.

Correlational analysis, then, is con- 
cerned with concomitant variation. Before 
pursuing this subject further, it is worth 
noting that the use of correlation implies
a type of problem quite strikingly different 
from that usual in the physical sciences. In 
the latter, relationship is simply consid- 
ered to be present or absent. Generally 
speaking, the physical scientist refuses to 
recognize intermediate degrees of relation- 
ship. If the points representing the paired 
values of the two variables fall along some 
reasonably simple mathematical curve, then 
he considers that the variables are related. 
If the points are so scattered that such a 
curve cannot be drawn, he ordinarily sus- 
pends judgment. 

Not so in the social sciences! Here 
the extremely simple type of relationship 
which may be expressed by a mathematical 
curve occurs but rarely. The social scien- 
tist therefore introduces the new concept 
of closeness of relationship, that is, the degree to which the
bivariate distribution in question approximates the perfect type of
relationship expressible by some relatively simple mathematical
function. To characterize this degree of approximation in a
quantitative manner is the function of correlation.

It cannot be too strongly emphasized that correlation is an
arbitrary process. We are not forced by the nature of things to
adopt any particular definition for the concept closeness of
relationship. We are quite free to choose our own definition. As a
matter of fact a great many different definitions have been
suggested, since each of the many suggested ways of measuring
relationship implies its own definition. We cannot say that one of
these is right and the others wrong. We can merely say that one is
more convenient to calculate than the others, or more useful in
solving certain problems, or easier to interpret.

We have said above that correlation essentially is an effort to
measure the degree to which a given bivariate distribution departs
from the perfect type of relationship expressible by some simple
mathematical curve. It is evident, therefore, that the correlation
technique must involve two choices: (1) the choice of some
mathematical function as a standard of perfect relationship, and (2)
the choice of some numerical measure of the degree to which the
given bivariate distribution departs from this standard.

We might discuss the significance of
the above two choices in regard to any of
the numerous devices for measuring correla- 
tion which have been proposed at various 
times. For the sake of brevity, however, 
we shall confine ourselves to the discus- 
sion of the product-moment correlation co- 
efficient (r) and the correlation ratio 
(eta). Since these two measures of rela- 
tionship are usually considered the best 
available, any criticisms we make of them 
may be expected to apply with still more 
force to the other and admittedly inferior 
measures. 

The standard of perfect relationship 
in the case of eta requires that all the 
points in the scatter diagram shall fall on 
a curve expressible as a mathematical func- 
tion which is single-valued in respect to 
both variables. In the case of r all the
points must fall on a straight line if the
relationship is to be considered perfect.
As far as their standard of perfect rela-
tionship is concerned, r may evidently be
considered a special case of eta. For this
reason we shall refer to the two methods
collectively as the r-eta technique. 

The standard of perfect relationship, 
as defined in the r-eta technique, is not 
entirely arbitrary. At least it is "natu- 
ral" in the sense that relationship thus 
defined as perfect is equivalent to rela- 
tionship as it is defined in the physical 
sciences by means of mathematical equations. 
But the second element in the r-eta tech- 
nique, namely, the measurement of the de- 
gree of departure from this perfect rela- 
tionship, is arbitrary to a considerably 
greater extent. The words, "closeness of 
relationship" have no meaning in the Eng- 
lish language which is so definite that 
that meaning can be converted into mathe- 
matical terms in the sense that, for exam- 
ple, the word, "velocity" can be so con- 
verted. Closeness of relationship is itself 
defined by the equations which define r and 
eta and we cannot quarrel with that definition. Just so every other
measure of correlation, say Spearman's foot rule, defines some sort
of "closeness of relationship" and we cannot say that one definition
is better than another, except in the sense that one definition may
be more useful. The only valid test of a definition is the pragmatic
test.

We can make these definitions more 
concrete in our own minds by interpreting 
them in various ways. Thus we may, if we 
wish, interpret the closeness of the rela-
tionship between x and y, when r is less than
unity in absolute value, by looking upon x 
as the sum of two components of which one 
is perfectly correlated with y, while the 
other has zero correlation with y. Or we 
may interpret r as the slope of a regres- 
sion line. Or we may interpret either r or 
eta as a function of the amount of scatter 
around the regression lines or regression 
curves. These interpretations do not make 
the definition of closeness of relationship 
any more "natural" but they may make the 
definition more useful and thus constitute 
an argument in favor of the use of the r- 
eta technique. 
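
The first of these interpretations can be sketched numerically. The example below is an illustration under stated assumptions, not part of the original article: the data are simulated, and the names `part_with_y` and `remainder` are hypothetical labels for the two components of x.

```python
import math
import random

def mean(v):
    return sum(v) / len(v)

def pearson(u, v):
    """Product-moment correlation of two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

rng = random.Random(1)
y = [rng.gauss(0, 1) for _ in range(2000)]
x = [0.6 * v + 0.8 * rng.gauss(0, 1) for v in y]   # simulated, imperfect relation

mx, my = mean(x), mean(y)
# Slope of the regression of x on y.
b = (sum((u - mx) * (v - my) for u, v in zip(x, y))
     / sum((v - my) ** 2 for v in y))

# Component of x perfectly correlated with y, and what is left over.
part_with_y = [b * (v - my) for v in y]
remainder = [(u - mx) - p for u, p in zip(x, part_with_y)]

r = pearson(x, y)
# Share of the variance of x carried by the perfectly correlated component.
share = sum(p * p for p in part_with_y) / sum((u - mx) ** 2 for u in x)

print(round(r, 3), round(share, 3), round(pearson(remainder, y), 6))
```

In this sketch the remainder has zero correlation with y up to rounding error, and the share of variance carried by the first component equals the square of r.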

To be useful a definition must be un- 
ambiguous. We shall now proceed to criti- 
cize the r-eta definition of "closeness of 
relationship" on the ground that it yields
different numerical values for the same bi-
variate distribution. If we can establish
this fact, then the definition is ambiguous; 
it is imperfectly useful; and it may proper- 
ly be considered a poor definition. 

The r-eta technique measures relation- 
ship by measuring departure from a standard 
of perfect relationship. This standard is 
a mathematical curve. If the relationship 
is imperfect the r-eta technique provides 
no unambiguous way for discovering this 
curve. We are left to choose it ourselves.
The value of the numerical measure of close- 
ness of relationship depends on this choice. 
Therefore the r-eta definition of closeness 
of relationship is ambiguous. 

Let us try to make this concrete. We 
shall discuss, therefore, the difficulties 
of measuring the relation of y to x by means of the regression curve
of y on x. Of course in any practical case there is the added
question of the regression of x on y. Let the scattergram be divided
into columns by lines parallel to the axis of y. Let the means of
these columns be computed and let these means be connected by some
function y = f(x) which passes through all of them. Let us suppose
that this function is a straight line. The regression is said to be
linear and r is considered the appropriate measure of relationship
to be used.
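
The construction of column means can be sketched as follows. This is an illustration, not the authors' computation: the data are simulated, the function name `correlation_ratio` and the column widths are assumptions, and eta is computed from the spread of the column means in the usual way.

```python
import math
import random
from collections import defaultdict

def correlation_ratio(x, y, width):
    """Eta of y on x, with columns formed by x-intervals of the given width."""
    columns = defaultdict(list)
    for xi, yi in zip(x, y):
        columns[math.floor(xi / width)].append(yi)
    grand_mean = sum(y) / len(y)
    # Variation of the column means about the grand mean, weighted by count.
    between = sum(len(col) * (sum(col) / len(col) - grand_mean) ** 2
                  for col in columns.values())
    total = sum((yi - grand_mean) ** 2 for yi in y)
    return math.sqrt(between / total)

rng = random.Random(2)
x = [rng.uniform(0, 10) for _ in range(500)]
y = [math.sqrt(v) + rng.gauss(0, 0.5) for v in x]   # simulated curved relation

eta_narrow = correlation_ratio(x, y, width=1.0)   # ten columns
eta_wide = correlation_ratio(x, y, width=2.0)     # five columns

print(round(eta_narrow, 3), round(eta_wide, 3))
```

The same scattergram gives one value of eta for each way of drawing the columns, so the column width is itself a choice that affects the result.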

This is an ideal case which seldom if
ever occurs in actual practice. But even
this ideal case is subject to a certain am- 
biguity. For we are interested usually, not 
in measuring closeness of relationship in 
the particular sample at hand, but rather 
in measuring it in the bivariate universe 
from which this sample was drawn. Even if 
we decide that it is appropriate to treat 
this sample as linear, we have no assurance 
that the corresponding regression in the bi- 
variate universe is also linear. 

The force of this objection is clearer 
when we consider the vastly more common 
case when the function which passes through 
the means of the columns is some non-linear 
function, say, y = F(x). Here there arises 
the question whether to use this particular 
function as a basis for calculating eta, or to divide the
scattergram into another set of columns and obtain probably a
different value for eta, or to use r as a matter of convenience as
though the regression were linear.

One might adopt as a criterion the [choice of that function] which
most closely approximates the corresponding regression function in
the bivariate universe from which this sample was drawn. Most
discussions are based on this principle; but the hopelessness of the
situation is evident. It involves arguing from the shape of this
sample to the shape of the universe from which it was drawn and
thence back to this sample--a vicious circle. For example, we may
calculate zeta, and compare it with its σ. This procedure gives us
the probability that a sample drawn at random from a linear universe
should depart from linearity as widely as this sample. But this is
not the same thing as the probability that this sample was drawn
from a linear universe. The evident distinction between these two
probabilities is seldom recognized by statisticians.

Whenever we endeavor to measure close-
ness of relationship between two variables
in a statistical sample we are faced with 
the two questions: Shall we treat this 
sample as linear or non-linear? If it is 
non-linear, which of the various possible 
regression functions shall we choose? Ac- 
cording as we choose different answers to 
these questions we shall obtain different 
numerical measures of the existing close- 
ness of relationship. There is no valid 
criterion to help us decide which of these 
values should be considered the correct one. 
Therefore closeness of relationship as de- 
fined by the r-eta technique is essential- 
ly ambiguous; and the r-eta definition of 
"closeness of relationship" is a poor one. 

In actual practice most statisticians cut the Gordian knot by
treating all regressions as though they were linear. This practice
is, of course, condemned by all writers on statistics. It has,
however, the advantage of providing an unambiguous definition for
closeness of relationship. But the disadvantages involved more than
counterbalance this advantage. For the indiscriminate use of r can
even lead to the finding of zero relationship in distributions where
the relationship is expressible by a simple mathematical function
and is therefore perfect in the sense of the physicist.
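
A small constructed case shows how this can happen. The sketch below is an illustration added here, not an example from the article: the points lie exactly on the parabola y = x² over values of x symmetric about zero, and eta is computed with one column for each distinct value of x.

```python
import math
from collections import defaultdict

def pearson(u, v):
    """Product-moment correlation of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

def eta_y_on_x(x, y):
    """Correlation ratio of y on x, one column per distinct x value."""
    columns = defaultdict(list)
    for xi, yi in zip(x, y):
        columns[xi].append(yi)
    grand_mean = sum(y) / len(y)
    between = sum(len(col) * (sum(col) / len(col) - grand_mean) ** 2
                  for col in columns.values())
    total = sum((yi - grand_mean) ** 2 for yi in y)
    return math.sqrt(between / total)

x = [-3, -2, -1, 0, 1, 2, 3]
y = [v * v for v in x]          # every point lies exactly on y = x**2

r = pearson(x, y)
eta = eta_y_on_x(x, y)
print(r, eta)
```

Here r is zero although y is an exact function of x, while eta of y on x equals one.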

More careful statisticians adopt some 
compromise. For example, they may decide 
to use some such test of linearity as the 
zeta test and to choose eta whenever the
ratio between zeta and its σ exceeds a cer-
tain value, and r otherwise. Of course this
does not remove the ambiguity as far as eta
is concerned, but it provides through r an
unambiguous definition for closeness of re- 
lationship in a considerable number of 
cases. 

The procedure mentioned in the last 
paragraph, however, is subject to certain 
considerable disadvantages. The definiteness of definition is
obtained by arbitrarily lumping together various kinds of bivariate
distributions which might profitably have been defined as
representing different degrees of closeness of relationship. Just
so, the entomologist might obtain unambiguity of definition by
lumping all butterflies together as one species. Again, this wide
use of r where the regression is not strictly linear destroys such
useful interpretations of closeness of relationship as that which
interprets closeness in terms of the scatter around the regression
line. In other words, although this usage cannot be criticized as
ambiguous in the sense of yielding different values for the same
distribution, it can be called ambiguous in that it yields the same
value for distributions which might usefully be considered as
representing different degrees of relationship.


It will be seen from these considerations that the r-eta technique
is subject to disadvantages of a serious character. Each bivariate
distribution can be made to [yield a value of r and various values
of eta]. In general these various numerical measures of closeness of
relationship will not be equal. We are therefore faced with the
necessity of choosing between them. If we make our choice on the
principle that the regression chosen should conform to the existing
bivariate distribution in some reasonable way, then we must make our
choice between r and the various etas on a non-mathematical basis
and the measure of closeness of relationship ceases to be truly
quantitative. If, on the other hand, we arbitrarily choose to use r,
even when the distribution is not surely linear, then we secure a
definite numerical value for our coefficient, but sacrifice a large
part of its meaning as a measure of relationship.
TABLES FOR FINDING THE PARTIAL COEFFICIENT OF CORRELATION
by 
William Dowell Baten 
University of Michigan 


The object of this article is to present tables for finding the
partial coefficient of correlation,

$$r_{12.3} = \frac{r_{12} - r_{13}r_{23}}{\sqrt{(1 - r_{13}^2)(1 - r_{23}^2)}} \tag{1}$$

between $x_1$ and $x_2$ with $x_3$ held constant. The quantities
$r_{13}$ and $r_{23}$ may be interchanged in the formula without
affecting the value of the partial coefficient. For a mathematical
treatment of partial correlation see Camp¹ and Rietz.²

The following tables contain values of $r_{12.3}$ for values of the
total correlation coefficients for every one-tenth, that is $r_{12}$
has values 0, .1, .2, ..., .9, 1.0 while $r_{13}$ and $r_{23}$ have
values equal to .0, .1, .2, ..., .9 respectively. The following
example will illustrate the meaning of the formula and also show how
to use the tables.

The correlation coefficient between the weights and chest
measurements of men who are twenty years of age is $r_{12} = .8$;
the correlation between weights and heights is $r_{13} = .5$, while
the correlation coefficient between chest measurements and heights
is $r_{23} = .6$. The partial coefficient of correlation between
weights and chest measurements with height held constant is
according to Table I

$$r_{12.3} = .72$$
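
The tabled value can be checked directly against formula (1). The function name `partial_r` below is a hypothetical label introduced for this sketch; the three correlations are those of the weight, chest measurement, and height example.

```python
import math

def partial_r(r12, r13, r23):
    """Partial correlation of x1 and x2 with x3 held constant, formula (1)."""
    return (r12 - r13 * r23) / math.sqrt((1 - r13 ** 2) * (1 - r23 ** 2))

# Weights and chest measurements (.8), weights and heights (.5),
# chest measurements and heights (.6).
value = partial_r(0.8, 0.5, 0.6)
print(round(value, 2))   # 0.72

# Interchanging r13 and r23 leaves the result unchanged.
swapped = partial_r(0.8, 0.6, 0.5)
```

The same function reproduces the symmetry noted in the text: exchanging the two correlations with the held variable does not alter the partial coefficient.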


To find this, find .5 in the column headed $r_{13}$ and then locate
in the column headed $r_{23}$ the number .6 between .5 and .6 for
$r_{13}$; now go down the column $r_{12} = .8$. This is the sixth
column in Table I. Go down this column to the row $r_{13}$,
$r_{23}$ = .5, .6. This gives $r_{12.3} = .72$.

Since $r_{13}$ and $r_{23}$ can be interchanged in formula (1), this
same value for $r_{12.3}$ is obtained when $r_{13} = .6$, and
$r_{23} = .5$. The value for $r_{12.3}$ is found in the same place
as before.

Suppose we know that $r_{12} = .4$, $r_{13} = .2$ and $r_{23} = .7$,
and wish to find $r_{12.3}$. Go to column for $r_{12} = .4$ and go
down this column to row $r_{13}$, $r_{23}$ = .2, .7. This gives .42
for $r_{12.3}$.


If $r_{13}$ and $r_{23}$ have the same sign the partial coefficient
of correlation can also be found from Table I, provided $r_{12}$ is
positive. Suppose $r_{12} = .5$, $r_{13} = -.3$ and $r_{23} = -.9$.
Pay no heed to the signs of $r_{13}$ and $r_{23}$. Go down the
column for $r_{12} = .5$ to row $r_{13}$, $r_{23}$ = .3, .9; this
gives .52 for $r_{12.3}$.

Suppose $r_{12} = .7$, $r_{13} = .8$ and $r_{23} = .2$. Interchange
the values of $r_{13}$ and $r_{23}$ and do as before. This gives for
the partial coefficient the value .92.

Table I can be employed when $r_{13}$ and $r_{23}$ are unlike in
sign provided $r_{12}$ is negative. In this case the sign of
$r_{12.3}$ is opposite to that found in the table. Suppose
$r_{12} = -.8$, $r_{13} = +.2$, and $r_{23} = -.3$. Pay no heed to
the signs. Go down the column for $r_{12} = +.8$ to the row
$r_{13}$, $r_{23}$ = .2, .3; this gives +.79 for $r_{12.3}$. This
value must be changed to -.79.

Consider the case when $r_{12} = -.3$, $r_{13} = -.6$, and
$r_{23} = .9$. Go down the column for $r_{12} = +.3$ to the row for
$r_{13}$, $r_{23}$ = +.6, +.9; this gives -.69 for $r_{12.3}$. By
changing the sign the real value of $r_{12.3}$ is +.69.

When $r_{13}$ and $r_{23}$ are unlike in sign and $r_{12}$ is
positive then Table II must be used for finding values for
$r_{12.3}$. Assume that $r_{12} = .3$, $r_{13} = .4$ and
$r_{23} = -.6$. In Table II go down the column for $r_{12} = .3$ to
the row for $r_{13}$, $r_{23}$ = ±.4, ±.6; this gives .74 for
$r_{12.3}$. This is as it should be for when $r_{13}$ and $r_{23}$
are unlike in sign and $r_{12}$ is positive the numerator in (1) is
positive, while the denominator is always positive.

1. Camp, B. H. "The Mathematical Part of Elementary Statistics", pp. 341-342.
2. Rietz, H. L. "Mathematical Statistics", pp. 96-101.

If $r_{13}$ and $r_{23}$ have the same sign while $r_{12}$ is
negative the value of $r_{12.3}$ is negative and a negative sign
must be placed before the value found in Table II. For example
$r_{12} = -.6$, $r_{13} = .3$ and $r_{23} = .5$. Go down the column
for $r_{12} = +.6$ to the row for $r_{13}$, $r_{23}$ = .3, .5; this
gives +.91. But since we know the value for $r_{12.3}$ is negative a
minus sign must be placed before this value. Thus the correct value
is $r_{12.3} = -.91$.
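
The sign rules for choosing between the two tables can be summarized in a short sketch. The functions `table_one`, `table_two`, and `lookup` are hypothetical stand-ins that compute an entry from formula (1) using magnitudes only; they do not reproduce the printed tables, and the rule coded in `lookup` is a reading of the cases described above.

```python
import math

def denominator(r13, r23):
    return math.sqrt((1 - r13 ** 2) * (1 - r23 ** 2))

def table_one(a, b, c):
    """Entry of the Table I kind for magnitudes a=|r12|, b=|r13|, c=|r23|."""
    return (a - b * c) / denominator(b, c)

def table_two(a, b, c):
    """Entry of the Table II kind for the same magnitudes."""
    return (a + b * c) / denominator(b, c)

def lookup(r12, r13, r23):
    """Partial coefficient obtained by table choice plus a final sign change."""
    a, b, c = abs(r12), abs(r13), abs(r23)
    alike = r13 * r23 >= 0
    # Table I: r12 positive with like signs, or r12 negative with unlike signs.
    use_table_one = alike if r12 >= 0 else not alike
    entry = table_one(a, b, c) if use_table_one else table_two(a, b, c)
    # When r12 is negative the sign of the tabled entry is reversed.
    return -entry if r12 < 0 else entry

def direct(r12, r13, r23):
    """Formula (1) applied to the signed correlations."""
    return (r12 - r13 * r23) / denominator(r13, r23)

for case in [(-0.8, 0.2, -0.3), (-0.3, -0.6, 0.9), (0.3, 0.4, -0.6), (-0.6, 0.3, 0.5)]:
    print(case, round(lookup(*case), 2), round(direct(*case), 2))
```

For the four signed examples in the text, the table choice followed by the sign change agrees with formula (1) applied directly to the signed values.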

These tables can be used to find the partial correlation
coefficient,

$$r_{12.34} = \frac{r_{12.3} - r_{14.3}\, r_{24.3}}{\sqrt{(1 - r_{14.3}^2)(1 - r_{24.3}^2)}}$$

between $x_1$ and $x_2$ when $x_3$ and $x_4$ are held constant.

If $r_{12}$, $r_{13}$, $r_{23}$, $r_{14}$, $r_{24}$, and $r_{34}$
are known, then $r_{12.34}$ can be found from Tables I and II. First
it is necessary to find $r_{12.3}$, $r_{14.3}$, and $r_{24.3}$.
After these coefficients of the second order have been found they
can be used as the total correlation coefficients were used before.
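
The step-by-step procedure amounts to applying formula (1) repeatedly, and the same recursion extends to any number of held variables. The sketch below assumes a hypothetical set of six total correlations among four variables; neither the values nor the function names come from the article.

```python
import math

# Hypothetical total correlations among variables 1 to 4.
totals = {(1, 2): 0.6, (1, 3): 0.4, (1, 4): 0.3,
          (2, 3): 0.5, (2, 4): 0.2, (3, 4): 0.4}

def total_r(i, j):
    """Total correlation between variables i and j, in either order."""
    return totals[(min(i, j), max(i, j))]

def partial(i, j, held):
    """Partial correlation of i and j with the variables in `held` constant."""
    if not held:
        return total_r(i, j)
    *rest, k = held
    rest = tuple(rest)
    # Apply formula (1) to coefficients of the next lower order.
    rij = partial(i, j, rest)
    rik = partial(i, k, rest)
    rjk = partial(j, k, rest)
    return (rij - rik * rjk) / math.sqrt((1 - rik ** 2) * (1 - rjk ** 2))

r12_34 = partial(1, 2, (3, 4))
r12_43 = partial(1, 2, (4, 3))   # same variables held, removed in the other order
print(round(r12_34, 3), round(r12_43, 3))
```

The order in which the held variables are removed does not affect the result, which is a useful check on hand computation from the tables.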

Consider the example:

$r_{13} = r_{\text{weight, height}} = .5$
$r_{12} = r_{\text{weight, rt. thigh}} = .8$
$r_{23} = r_{\text{rt. thigh, height}} =$ [illegible]
$r_{14} = r_{\text{weight, chest meas.}} =$ [illegible]
$r_{24} = r_{\text{rt. thigh, chest meas.}} = .4$
$r_{34} = r_{\text{chest meas., height}} = .6$

Find the value of

$r_{12.34} = r_{\text{weight, rt. thigh . chest meas., height}}$

From the tables $r_{12.3}$ = .79 = .8, $r_{14.3}$ = .14 = .1,
$r_{24.3}$ = +.05+ = +.1. Now use $r_{12.3}$ as $r_{12}$, $r_{14.3}$
as $r_{13}$ and $r_{24.3}$ as $r_{23}$ and look in the tables. This
gives .9 for $r_{12.34}$.

It must be remembered that when the 
total coefficients of correlation are ex- 
act numbers the tables give results correct 
to two significant figures but when the to- 
tal correlation coefficients are exact for 
one significant figure the tables are cor- 
rect for only one figure. Other tables are 
being prepared whereby results can be read 
to two significant figures, in the case un- 
der consideration to two decimal places. 
This longer table will of course give more 
accurate results yet the tables presented 
here can be used for rough work and will 
give a very good idea concerning the size 
of the various correlation coefficients. 

Partial coefficients of correlation of 
higher order may also be obtained from 
these tables. For example, the partial 
coefficient, 


$$r_{12.34\ldots n} = \frac{r_{12.34\ldots(n-1)} - r_{1n.34\ldots(n-1)}\, r_{2n.34\ldots(n-1)}}{\sqrt{\left[1 - r_{1n.34\ldots(n-1)}^2\right]\left[1 - r_{2n.34\ldots(n-1)}^2\right]}}$$





may be obtained from these tables by 
building up the various partial coeffi- 
cients of correlation of lower orders. 
TABLE I
TABLE FOR FINDING THE PARTIAL COEFFICIENT OF CORRELATION WHEN
(a) $r_{12}$ IS POSITIVE AND $r_{13}$ AND $r_{23}$ ARE ALIKE IN SIGN; (b) $r_{12}$ IS
NEGATIVE AND $r_{13}$ AND $r_{23}$ ARE UNLIKE IN SIGN

The value of $r_{12.3}$ when $r_{12}$ is equal to

[Table entries not recoverable from the source.]

TABLE II
TABLE FOR FINDING THE PARTIAL COEFFICIENT OF CORRELATION WHEN
(a) $r_{12}$ IS POSITIVE AND $r_{13}$ AND $r_{23}$ ARE UNLIKE IN SIGN; (b) $r_{12}$ IS
NEGATIVE AND $r_{13}$ AND $r_{23}$ ARE ALIKE IN SIGN

The value of $r_{12.3}$ when $r_{12}$ is equal to

[Table entries not recoverable from the source.]

A minus placed after a five, for example .65-, means that this .65 was not quite .65. A plus after a five means
that the rounded off number was greater than five. A five with a dot above it means that it is exactly five.

AN EMPIRICAL TEST OF SAMPLING


by


Donovan A. Johnson
Stillwater High School, Stillwater, Minn.


and


Alvin C. Eurich
University of Minnesota

The theory of sampling has been ap-
plied widely to psychological and education-
al data. Probable and standard errors have
been used in these fields until the highly 
trained mathematician and the novice alike 
regard the results with a great deal of 
skepticism. And well they might, for too 
often formulae have been applied when none 
of the assumptions underlying them are ful- 
filled or when it is not known that any of 
the assumptions are satisfied. Much can be 
gained, it seems, through empirical tests 
of the theory of sampling with a wide vari- 
ety of data. This is particularly true 
where it is desirable to observe annual 
trends and the summarization of data for 
the entire population is too laborious and 
expensive. The present study deals with a 
situation of this type. 


THE PROBLEM 


Each year the Minnesota State Depart- 
ment of Education collects considerable in- 
formation concerning all teachers and school 
administrators in the public schools 
throughout the state. Not only the State 
Department but educational agencies as well 
are concerned about the trends in the qual- 
ifications of this group. It is a very 
practical question, therefore, to ask how 
large a proportion of the total group is 
necessary in order to obtain reliable re- 
sults for the data collected annually. To 
answer this question all the data collected 
in September 1931 from 3,437 high school 
teachers, principals and superintendents 
were analyzed by ten percent samples and 
by various combinations of these samples. 





The reports required of all school 
systems in Minnesota contained information 
in regard to the class of school, the number
of periods in a school day, the length
of each period in minutes, and the names of 
the superintendent, principal, and teachers. 
For each person on the staff, the following 
information was supplied: the kind of cer- 
tificate held, date of expiration of the 
certificate, the major and minor fields as 
given on the certificate, name of school 
from which he was graduated, date of grad- 
uation, course taken, years of experience, 
subjects taught listed by periods and by 
grades, and annual salary. In general the 
types of statistical constants checked for 
reliability were percentages, medians, and 
quartile deviations. This paper must be
limited to a few representative samples of
the wide variety of analyses that were
made. However, the total picture was ex-
ceedingly consistent in showing that reli-
able results can be obtained for practical-
ly all data when less than a third of the
reports are used.


METHODS OF ANALYSIS


To facilitate the analysis, the responses
on each report were converted into
a code and punched on a Hollerith card.
These were readily sorted and tabulated 
mechanically. 

The samples used throughout the study 
were selected in the following manner. 
First, the cards were alphabetized by towns 
or names of the schools and by names of the 
teachers, principals, and superintendents 
within the schools. These cards were then
divided into ten samples of approximately
10 percent each. The first lot was selected
by taking every tenth card from the
files after they had been arranged as ex- 
plained above. This lot then contained the 
tenth card, twentieth card, thirtieth card 
and so on through the entire files and was 
designated sample number zero, which was 
punched on each card. Sample number one 
was selected by taking the first, eleventh, 
twenty-first card and so on through the 
files. Likewise, sample number two con- 
tained the second, twelfth, twenty-second 
card and so on. This method was followed 
throughout, dividing the group into ten 
samples selected at random according to one 
of the best mechanical methods. 
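
The card-dealing procedure described above can be sketched in a few lines of Python. This is only an illustration: the function name and the use of card positions in place of real records are assumptions, not part of the original study, which sorted physical Hollerith cards.

```python
from collections import defaultdict


def split_into_systematic_samples(cards, n_samples=10):
    """Assign the i-th card (counting from 1) to sample number i mod n_samples.

    With n_samples=10, sample 0 receives cards 10, 20, 30, ...,
    sample 1 receives cards 1, 11, 21, ..., and so on.
    """
    samples = defaultdict(list)
    for position, card in enumerate(cards, start=1):
        samples[position % n_samples].append(card)
    return dict(samples)


# Hypothetical deck: 3,437 cards, each represented only by its position
# after alphabetizing by school and by name within school.
cards = list(range(1, 3438))
samples = split_into_systematic_samples(cards)

print(samples[0][:3])   # [10, 20, 30]
print(samples[1][:3])   # [1, 11, 21]
print(sorted(len(s) for s in samples.values()))
```

Because the deck size is not a multiple of ten, the ten samples differ in size by at most one card, which is why the article speaks of samples of "approximately" 10 percent.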

The percentages, medians, and quartile 
deviations were obtained for the various 
samples as well as for the entire data. The 
sample percentages, medians, or quartile 
deviations were considered reliable if they
did not deviate by more than ±4 PE from
the corresponding values for the total population.
The probable errors were based
upon the entire group because the sample 
probable error proved to be too large for 
practical application. There can be no 
question concerning the interpretation of 
the results when this method is used be- 
cause the probable error based upon the en- 
tire data provides a more rigid test of re- 
liability than would be obtained through 
the use of probable errors based upon the 
frequency within each sample. 
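
The reliability criterion can be sketched as follows. The article does not state how its probable errors were computed; the sketch assumes the conventional formula, in which the probable error is 0.6745 times the standard error and the standard error of a percentage p based on N cases is the square root of p(100 - p)/N. The function names are illustrative.

```python
import math

PE_FACTOR = 0.6745  # assumed: probable error = 0.6745 x standard error


def probable_error_of_percentage(pct, n):
    """Probable error, in percentage points, of a percentage based on n cases."""
    standard_error = math.sqrt(pct * (100.0 - pct) / n)
    return PE_FACTOR * standard_error


def is_reliable(sample_value, total_value, probable_error, limit=4.0):
    """The article's criterion: the sample value lies within +/- 4 PE of the total."""
    return abs(sample_value - total_value) <= limit * probable_error


# Figures quoted in the article: 8 percent of a total group of 1,090.
pe = probable_error_of_percentage(8, 1090)
print(round(pe, 1))            # 0.6
print(is_reliable(6, 8, pe))   # True: a deviation of -2 is inside 4 PE
```

Under this assumed formula the probable errors for percentages of 8, 61, 14, and 17 in a group of 1,090 round to .6, 1.0, .7, and .8.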

The assumptions underlying this check 
of the reliability of samples are those
generally made when probable errors are obtained.
To understand fully the nature of
this analysis, specific attention must be 
directed to the following two assumptions: 
(1) If the same percentage or other 
statistical constant is determined for an 
infinite number of random samples that are 
relatively large and of equal size, the 
values will distribute themselves in a nor- 
mal distribution with the mean equal to the 
true value obtained from the entire group. 
(2) The entire group used in this 
study is merely a sample of a still larger 
group in which the larger sample values also
distribute themselves normally. This assumption
is reasonable since all Minnesota
teachers, principals and superintendents
might be considered a sample of the teach- 
ers, principals and superintendents in the 
Northwest. It was necessary to make this 
assumption in order to obtain the probable 
errors of the constants derived from the 
entire Minnesota group. Regarded as true 
percentages or medians the probable errors 
would have no meaning. 

The validity of the first assumption 
was tested by distributions of frequencies 
derived from ten percent samples and by 
averages of the values obtained from ten 
percent samples. Since in terms of the 
original samples it was possible to secure 
only ten values, a situation had to be
hypothesized with a larger number of samples
of equal size. This was done by adding the 
frequencies of two ten percent samples and 
dividing the sum by two. In this manner 
all possible combinations of the ten sam- 
ples taken two at a time were added to- 
gether. To the results from the 45 pos- 
sible combinations were added those of the 
original ten samples, thus making a total 
of 55 samples. While the result derived 
from the 55 samples is probably not a true 
picture of an infinite number of samples, 
it tends to approach that situation. 
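
The construction of the 55 values can be sketched as below. The counts are invented for illustration (the article quotes only the values 16, 20, and three samples of 26), and the function name is an assumption.

```python
from itertools import combinations
from statistics import mean


def pooled_values(sample_counts):
    """Return the original sample values followed by the average of every pair."""
    pair_averages = [(a + b) / 2 for a, b in combinations(sample_counts, 2)]
    return list(sample_counts) + pair_averages


# Hypothetical counts for ten 10 percent samples (not the article's data).
hypothetical_counts = [16, 20, 26, 26, 26, 28, 30, 30, 32, 34]
values = pooled_values(hypothetical_counts)

print(len(values))                                # 55 = 10 originals + 45 pairs
print(mean(values), mean(hypothetical_counts))    # the two means agree
```

Each original sample enters exactly nine of the 45 pairs, so pooling the pair averages with the originals leaves the mean unchanged and only fills in the shape of the distribution around it.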


RESULTS 


The data on the number of graduates 
of the University of Minnesota in Class B 
four-year high schools will serve as a 
typical illustration of the distributions 
obtained. In Table I the number of Minne- 
sota graduates for each ten percent sample 
is given. Thus, in one ten percent sample
there were 16 graduates of the University 
of Minnesota; in another there were 20; in 
three, 26; etc. On the bottom line appears 
the distribution of the results for the 
hypothetical situation of 55 ten percent 
samples obtained by the method described 
above. The nature of these distributions 
is clearer in Fig. 1. The points located 
by the X's represent the data for the original
ten percent samples and may therefore
be read as follows: for one sample
the number of graduates from the University
of Minnesota is 16; for another sample, 20;
and for three others, 26; etc. The continuous
line represents the distribution for
the hypothesized situation and the broken
line represents the smoothed frequency surface.
The number of Minnesota graduates in
the total population as estimated from the 
mean of the frequency distribution for the 
55 samples is 281; the actual number of 
graduates in the total group is 273. Since 
this degree of similarity appears repeated- 
ly throughout the results, the first assump- 
tion that the mean of the sample constants 
is equal to the true values is practically 
realized. 

To illustrate further the analysis of 
the reliability of percentages for samples 
varying in size, the data on the number of
periods in the school day have been selected
as representative. In regard to this item,
Table II contains the following data: 


TABLE I


DISTRIBUTION OF THE NUMBER OF GRADUATES OF THE UNIVERSITY
OF MINNESOTA IN CLASS B, FOUR-YEAR HIGH SCHOOLS, FOR EACH
TEN PERCENT SAMPLE AND FOR THE AVERAGES OF THE TEN PER-
CENT SAMPLES


                              Number of individuals
                   16  18  20  22  24  26  28  30  32  34  36   Total

10% Samples         1   .   1   .   .   3   (remaining entries illegible)   10
Averages based
on combinations     1   1   2   5   5  11  11  10   6   2   1      55
of samples





TABLE II


THE DEVIATIONS OF SAMPLE PERCENTAGES FROM THE PERCENTAGES
OF THE TOTAL GROUP OF TEACHERS, PRINCIPALS, AND SUPER-
INTENDENTS IN CLASS B, FOUR-YEAR HIGH SCHOOLS HAVING
VARIOUS NUMBERS OF CLASS PERIODS IN THE SCHOOL DAY

                                     Deviations of Sample Percentages
                                     from Percentage of Total Group
Number of   Percentage of
Periods     Total Group     PE      10%     20%     30%     40%     50%

   9              8          .6     -2*     -2      -2      -1      -1
   8             61         1.0     -3*     -1      -1      -1      -1
   7             14          .7      ?*      2       0       0       0
   6             17          .8      0*      0       1       1       1
Total         1,090                 105     212     327     459     540


Fig. 1. The number of graduates of the University of
Minnesota in Class B, four-year high schools
as estimated by ten percent samples of the
total group of teachers, principals, and
superintendents.


1. The percentage of the entire group
within Class B, four-year high schools for
each number of periods in the school day.
(True percentage.)
2. The probable error of the percentage.
3. The deviations of the sample per-
centages from the true percentage.
4. The minimum size of reliable sam-
ples for each percentage as indicated by
the star.
5. The total number of individuals in
each category.
The second column of the table indi-
cates that 8 percent of the total group of
teachers, principals and superintendents
were employed in school systems with nine
class periods in the school day. The per-
centages for both the 10 and 20 percent
samples deviate from that for the total by
-2. For the larger samples the deviation
is -1. The remaining portion of the table
likewise reveals that for practical purposes
a 30 percent sample is large enough
to yield reliable results for the particular
type of information included.

Fig. 2 represents graphically certain 
data concerning the proportion of teachers, 
principals and superintendents connected 
with schools having different numbers of 
periods in a school day. The shaded columns 
represent the proportions derived from
10 percent samples; the columns in outline,
the 20 percent samples; and the black columns,
the entire group. While again this
graph is merely representative, it portrays
the fact that for the data analyzed, the re- 
sults obtained from the 10 and 20 percent 
samples are strikingly the same as for the 
entire group. 

A summary of reliable sample percent- 
ages for data on length and number of peri- 
ods related to class of school appears in 
Table III. In terms of the criterion set 
up it may be seen that 42 of the 50 percentages
are reliable when based on a 10
percent sample, five additional require a
20 percent sample, and only three of the 50
require a sample as large as 30 or 40 percent
to be reliable.

Fig. 2. The proportion of teachers, principals, and
superintendents in schools with different
numbers of class periods in the school day
as determined by a ten percent sample, a twenty
percent sample, and the entire data.


TABLE III


THE DISTRIBUTION OF RELIABLE SAMPLE PERCENTAGES
FOR DATA ON LENGTH AND NUMBER OF
PERIODS RELATED TO CLASS OF SCHOOL


Class of
School       10%     20%     30%     40%     Total
0              8       1                        9
2              7       2                        9
3              6       1                        7
4              9                                9
5              7       1                        8
?              5               1       2        8
Total         42       5       1       2       50


In testing the reliability of sample
percentages, 403 different items were analyzed.
The total number of individuals in
the separate groups varies from 0 to 1,090.
The percentages range from 0 to 98. These
figures give some indication of the comprehensive
treatment of the data by samples.
Throughout, the results are consistent in
showing that percentages based on a sample
of 30 percent are reliable. Only 17 of the
403 percentages based upon a 30 percent
sample deviate more than four probable errors
from the corresponding percentages of
the entire group. If the results from any
30 percent sample had been used and interpreted
in terms of the probable errors based
upon these samples, they would not differ
significantly from the results derived from
the total group.

A check of the reliability of salary
medians was made by grouping teachers,
principals and superintendents according to
(1) place of graduation, (2) kind of degree,
(3) college course, (4) class of school,
(5) kind of certificate, and (6) position
and experience. The deviations of sample
medians from the true median annual salary
for graduates of various groups of colleges
are given in Table IV. In Fig. 3, similar
data are shown for educators in various
classes of schools. On this graph, the
horizontal axis represents the size of sample,
and the vertical axis represents the
deviation of the sample medians from the
true median annual salary in dollars for
each class of school. The horizontal zero
line represents the true median. The limits
of reliability (±4 PE) are indicated on
both sides of the diagram, the curve and
its reliability limit having the same legend.
A study of the graph shows in general
that the curves for different classes of
schools approach the true medians as the
size of the sample is increased. With a
20 percent sample for class 0 schools, the
curve barely comes within the limits of reliability.
With a 30 percent sample, the
deviation from the true median is less than
one probable error. With samples of 40 percent
or larger the variation from the true
median is very slight. Only for class 3
schools are the samples of 30 percent or
more unreliable. Since only 105 individuals
are included in schools of this class, it is
not surprising that the results derived
from samples are not more reliable.

TABLE IV 


THE DEVIATIONS OF SAMPLE MEDIANS FROM THE TRUE MEDIAN 
ANNUAL SALARY FOR GRADUATES OF VARIOUS COLLEGES 


JOURNAL OF EXPERIMENTAL EDUCATION 





                                                  Deviations of Sample Medians
Place of              Total Group                 10%     20%     30%     40%     50%
Graduation          Number    Median     PE      Sample  Sample  Sample  Sample  Sample

College in Minn.     1,496     1,582     7.68     -13*      5       ?       8       2
Outside of Minn.       920     1,680    16.6      160      65      45*     15       3
U. of Minn.            691     1,420    11.6      -26*      7      -5       0      -2
No report              128

Total                3,437





*The smallest size of sample which gives a reliable re- 
sult in that category. 



The median annual salaries for teachers,
principals and superintendents grouped
according to class of school are shown in
Fig. 4 for a 20 percent sample, a 30 percent
sample, and for the entire data. Clearly,
a 30 percent sample yields a picture not
unlike that for the total group. For all
classes of schools except 3, the median
based upon 30 percent deviates less than
one probable error from the median based
upon the entire data.

In Table V may be found a summary of
the minimum size of sample yielding reliable
median salaries for the various categories.
A total of thirty-six distributions
was analyzed. The frequencies within these
groups range from 49 to 1,858. The median
annual salaries range from $1,150 to
$2,040. Only one of the 36 distributions
required a sample larger than thirty percent.
For 13 distributions a 10 percent
sample was sufficient.

Fig. 3. The deviations of median annual salaries of teachers, principals, and superintendents in each class of school
as determined by different samples from the median annual salary for the entire group within each class of
school.

Fig. 4. The median annual salaries of teachers, princi-
pals, and superintendents in different classes
of schools as derived from a twenty percent
sample, a thirty percent sample, and the entire
group.


TABLE V 


A FREQUENCY DISTRIBUTION OF THE SIZE OF RELIABLE
SAMPLE IN THE DETERMINATION OF MEDIAN ANNUAL
SALARIES FOR VARIOUS CLASSIFICATIONS


                          Frequency of Reliable Medians
                           10%      20%      30%      40%
Classification            Sample   Sample   Sample   Sample
Class of School 1 4 1 

Course in College 1 2 

Kind of Degree 2 1 
Place of Graduation 2 1
Position and Experience 6 8 1
Kind of Certificate 1 2 5
Total 13 14 8 1





The reliability of the semi-interquartile
ranges was also determined for 27
groups. Again a 30 percent sample proved
to be adequate. The variations from the
true value, however, were considerably
greater than for the percentages or medians.


SUMMARY 


This study is a practical test of the 
theory of sampling. Specifically, information
concerning the qualifications of high
school teachers, principals and superintendents
in Minnesota was collected in regard
to all such persons employed in the
state--a total of 3,437. The total popula- 
tion was divided into ten random samples. 
Data were analyzed for each sample and for 
various combinations of samples. While it 
is difficult in this brief space to give a 
complete picture of the analysis, it may be 
said in conclusion that the results support 
the following generalizations: 

1. A 30 percent sample of high school
teachers, principals and superintendents
within the State of Minnesota is sufficiently
large to represent the entire group in
dealing with data concerning teachers'
qualifications. The use of larger samples
does not increase the reliability sufficiently
to warrant the time and effort required.


2. The percentages, medians and quartile
deviations based upon 30 percent of the
group deviate more than four probable errors 
from the result for the total group slightly 
more than five times out of 100. 

3. The method of sampling used in this 
study may be used in annual investigations 
of teachers' qualifications in the State of 
Minnesota. 

While these generalizations are sound 
for the data that have been analyzed, no 
implication is warranted extending the ap- 
plication of these results to other types 
of data. It is not possible to infer that 
a 30 percent sample of any group is suffi- 
cient to yield reliable results for other 
types of data. Each situation must be 
tested separately.


THE INTERRELATION OF D, V, T, AND P SCORES
by 
J. DeWitt Davis 
Director of the School of Education 
Texas College of Arts and Industries 
Kingsville, Texas 


More and more the use of objective 
measurement is being relied upon in all 
phases of the educational process. The setting
up of objectives for teaching, the
study of individual differences in order to 
know how each student can be best guided in 
the school program, the measurement of out- 
comes in terms of pupil changes, all of 
these procedures require statistical anal- 
ysis. One of the phases of this analysis 
involves the comparisons of scores. That 
is to say, scores made on one test require 
comparison with those made on other tests 
when the two series of scores in their 
first state are not directly comparable. Advanced
textbooks to date have touched upon
some of the factors here considered. The 
purpose of this paper is to bring some of 
these procedures together into one brief 
treatise for more ready reference. 

To do this effectively requires first 
a statement of meaning involved in the word 
comparable. For the specific consideration 
here presented this implies a common cen- 
tral tendency and a common deviation unit. 
In most cases the population being examined 
and compared one with the other can be rep- 
resented as a normal distribution, and for 
that reason the functions of the normal 
curve are involved in the discussion. With- 
out sufficient numbers and adequate 
sampling to justify this assumption, exten- 
sive statistical study of scores can hardly 
be justified. To facilitate understanding, 
the following brief statements are given 
concerning the meaning attached to the va- 
rious scores which are deemed comparable. 

D-scores are variously called standard
deviation scores, Z-scores, x/σ, and d/σ
scores. D-scores are all directly compara-
ble because each D-score distribution has a
mean of zero and a standard deviation of
one.

V-scores are comparable, having been 
converted from a given series with its own
central tendency and deviation measure to 
values in another series with a different 
central tendency and a different deviation 
value. 

T-scores are comparable, having been 
reduced to a common series that has a mean 
of 50 and a standard deviation of 10; that 
is, they are transformed to a given stand- 
ard in central tendency and dispersion. 

P-scores, or percentiles, represent 
points on a given score scale below which 
a certain percentage of the total population
measured by the scale lies; e.g., P-score
31 means that 31 percent of the population
lies below, or 69 percent above, the
raw score involved. Because they locate 
relative positions within a given group 
they too are comparable scores. 

These particular symbols are proposed 
merely to facilitate their recall. They 
all sound alike--D, V, T, P--and, in fact,
have much more than sound in common. Each
possesses, moreover, peculiar advantages for
certain analyses. Each letter used as the
score name may be thought of as the in- 
itial letter of the word that characterizes 
its unique meaning. Thus D stands for "deviate"
on the base line of the normal curve
involved. V stands for "vertere", Latin
meaning to turn, to change from one to another.
T stands for "transform", to make
over from a given raw series to a standard 
one having a mean of 50 and a sigma of 10. 
P stands for "percentile", or point on a 
given scale below and above which certain 
portions of the total population involved 
are distributed.

Whenever raw scores are reduced to a
comparable basis certain assumptions are 
made. (1) The traits being measured are 
supposedly normally or comparably distrib- 
uted; (2) the measures employed are compar- 
ably reliable; (3) the cases involved are 
adequate in number and selection to repre- 
sent a random sampling. Given approximately 
these conditions one is probably justified 
in reducing his scores to one of the pro- 
posed comparable bases. 

D-scores: In a normal distribution
which can be represented graphically by the
Gaussian curve each score may be thought of
as a locus on the base line of that curve.
The perpendicular to the base line that cuts
the normal area into two equal parts is at
the mean, and its base-line value, in a dis-
tribution of D-scores, is zero. Perpendic-
ulars erected +1σ to the right and -1σ to
the left of this zero point include approx-
imately 68.27 percent of the total area of
the curve. That is to say, the unit of
base-line variation is one sigma, and ±3 of
these sigma units comprehend approximately
99.7 percent of the total curve area. With
these things in mind one can proceed to re- 
duce any series of raw scores to the D-score 
basis. The comparability rests in the fact 
that the means have been made equal to each 
other by transformation to the D-score 
basis. The mean scores, under these condi- 
tions, in normal curve-base-line units are 
equal to zero, and the sigmas are each equal 
to unity, for the same reason. To convert 
a score or a series to the D-score basis 
the following formula is employed. 

\[ D_x = \frac{X - M_x}{\sigma_x} = \frac{d}{\sigma_x} \tag{1} \]

In this formula $D_x$ equals the D-score
corresponding to X, its raw score; $M_x$
is the arithmetic mean of the X series; and

\[ \sigma_x = \sqrt{\frac{\sum d^2}{N}} \]

in which the small, or lower-case, letter d repre-
sents the distance in raw score points above
or below the group mean; that is, $d = X - M_x$.
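
As a minimal sketch of formula (1), the following Python reduces a short series of raw scores to D-scores. The raw scores and the function name are hypothetical; the sigma divides by N, as in the text, not by N - 1.

```python
import math


def d_scores(raw_scores):
    """Convert raw scores to D-scores: D = (X - M) / sigma, with sigma = sqrt(sum(d^2) / N)."""
    n = len(raw_scores)
    mean = sum(raw_scores) / n
    sigma = math.sqrt(sum((x - mean) ** 2 for x in raw_scores) / n)
    return [(x - mean) / sigma for x in raw_scores]


raw = [60, 75, 90, 105, 120]   # hypothetical raw X series
converted = d_scores(raw)
print([round(d, 4) for d in converted])
# [-1.4142, -0.7071, 0.0, 0.7071, 1.4142]
```

The converted series has a mean of zero and a sigma of one, which is the property that makes D-scores from different tests directly comparable.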
V-scores: One may wish to make a se-
ries of raw X-scores directly comparable to
another series of raw scores Y, secured by
the same individuals or their controls on
another scale, which has a different mean
and a different variation unit. The formula 
useful in this case involves the same as- 
sumptions as does that for D-scores. Part 
of the technical procedure is similar also. 
A statement of the formula shows this to be 
true. 


\[ V_x = M_y + \sigma_y D_x \tag{2} \]


In this formula $V_x$ equals the con-
verted X-score in terms of the Y series,
whose central tendency is $M_y$ and whose de-
viation measure in raw score points is $\sigma_y$.
$D_x$ is the same as that used in formula (1)
above. If only one score is wanted, $D_x$ in
the form of

\[ \frac{X - M_x}{\sigma_x} \]

is found, and formula (2) then can be em-
ployed as given above. If the whole X se-
ries is to be converted, time can be saved
by reducing the formula to a simpler state,
in which form it will require fewer mathe-
matical computations, as follows:

Since, by (1) above,

\[ D_x = \frac{X - M_x}{\sigma_x}, \]

then the term $\sigma_y D_x$ of (2) is

\[ \sigma_y D_x = \sigma_y \left( \frac{X - M_x}{\sigma_x} \right) = \frac{\sigma_y}{\sigma_x} (X - M_x). \]

Equation (2) becomes then, by substi-
tution,

\[ V_x = M_y + \frac{\sigma_y}{\sigma_x} (X - M_x). \]

In the right-hand member of this last
equation there are now four constants, name-
ly, $M_y$, $\sigma_y$, $M_x$ and $\sigma_x$, all of which may be
collected as follows and equated to C, a
large constant, thus:

\[ C = M_y - \frac{\sigma_y}{\sigma_x} M_x. \]

Equation (2) then becomes:

\[ V_x = C + \frac{\sigma_y}{\sigma_x} X, \]

but $\sigma_y / \sigma_x$ is also a constant. Let this be
represented as c'.
We then get,

\[ V_x = C + c'X \tag{3} \]

When a calculating machine is at hand
this formula is very useful. Put constant
c' in as the multiplicand for the whole se-
ries, multiply it by each variable raw score
X, and add the respective products to con-
stant C. The total in each case is the
converted score, $V_x$. A converted score may
be compared directly with its corresponding 
score Y, for now they have the same mean 
and the same standard deviation. 
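
The shortcut of formula (3) can be sketched as follows. The X figures (mean 90, sigma 15, raw score 60) are those of the article's own illustration; the Y series (mean 50, sigma 8) and the function names are assumptions made for the example.

```python
def v_score_constants(m_x, sigma_x, m_y, sigma_y):
    """Return (C, c') such that V = C + c' * X, as in formula (3)."""
    c_small = sigma_y / sigma_x
    c_large = m_y - c_small * m_x
    return c_large, c_small


def to_v_score(x, c_large, c_small):
    """Apply formula (3) to one raw X-score."""
    return c_large + c_small * x


# X series: mean 90, sigma 15. Hypothetical Y series: mean 50, sigma 8.
C, c_prime = v_score_constants(90, 15, 50, 8)

direct = 50 + 8 * ((60 - 90) / 15)        # formula (2): M_y + sigma_y * D_x
shortcut = to_v_score(60, C, c_prime)     # formula (3): C + c'X
print(round(direct, 6), round(shortcut, 6))
```

Both routes give the same converted score; the shortcut needs only one multiplication and one addition per raw score once C and c' have been computed for the series.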

T-scores: The only essential differ- 
ence among the D-scores, V-scores, and T- 
scores is that each series has a different 
mean and sigma. The D-scores have M = 0
and σ = 1; V-scores have M and σ both
equal to those of the series with which
they are to be compared; T-scores have an
arbitrarily set mean and sigma, generally
50 and 10 respectively. Any other arbitrary
mean and variation units might be
used. These are suggested, not only be- 
cause others have employed them, but pri- 
marily because they are easy to work with. 

The formula to use in reducing scores 
to this T-score basis is built out of for-
mula (2) as follows: Since $M_y = 50$ and
$\sigma_y = 10$, by arbitrary assumption, then by
substituting in (2)

\[ T_x = 50 + 10 D_x \tag{4} \]

As before, if only one T-score is de-
sired, $D_x$ in the form of $\frac{X - M_x}{\sigma_x}$ is found;
next multiply by 10, by moving the decimal
one place to the right, then add this prod-
uct algebraically to 50. Corresponding $T_y$
scores from another raw series of Y scores
may be found in the same way and thus di-
rect comparison becomes possible, $T_x$ with
$T_y$. However, as before, if the whole se-
ries is to be put on the basis of T-scores,
labor can be reduced considerably by sim-
plifying the statement so that it will re-
quire fewer mathematical computations, as
follows:

By substituting from (1) to find the
term $10 D_x$, we get,

\[ 10 D_x = \frac{10 (X - M_x)}{\sigma_x}. \]

But $10 / \sigma_x$ has a constant value. Let this
equal c'. Substituting further:

\[ 10 D_x = c'(X - M_x) = c'X - c'M_x. \]

In this equation, $c'M_x$ is also a con-
stant in value, a product of two constants.
The new mean 50 in equation (4) above is
also a constant. If all three of these
constants are combined then the large con-
stant, C, involved in equation (4) becomes:

\[ C = 50 - \frac{10 M_x}{\sigma_x} = 50 - c'M_x. \]

Substituting in the right-hand member of
equation (4) above,

\[ T_x = c'X + (50 - c'M_x) = c'X + C \tag{5} \]


A calculating machine is useful at 
this juncture, as before suggested. The 
small constant, c', can be used as multiplicand
in the machine. Multiply it by
each variable X-score respectively and add 
to each product the constant C. The 
T-scores are thus derived, ready for direct 
comparison with all other T-scores derived 
in the same way. 
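
A minimal sketch of formulas (4) and (5), using the figures of the article's own illustration (raw score 60 in a series with mean 90 and sigma 15). The function names, and the default arguments of 50 and 10, are choices made for the example.

```python
def t_score_constants(m_x, sigma_x, new_mean=50.0, new_sigma=10.0):
    """Return (C, c') such that T = c' * X + C, as in formula (5)."""
    c_small = new_sigma / sigma_x
    c_large = new_mean - c_small * m_x
    return c_large, c_small


def to_t_score(x, c_large, c_small):
    """Apply formula (5) to one raw X-score."""
    return c_small * x + c_large


C, c_prime = t_score_constants(90, 15)

via_formula_4 = 50 + 10 * ((60 - 90) / 15)   # formula (4): 50 + 10 * D_x
via_formula_5 = to_t_score(60, C, c_prime)   # formula (5): c'X + C
print(round(via_formula_4, 6), round(via_formula_5, 6))
```

A raw score two sigmas below its mean lands at a T-score of 30 by either route, and the mean of the raw series always maps to 50.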

P-scores: It is seen from the above 
discussion that V-scores and T-scores are 
a direct modification of D-scores. P-scores 
ordinarily are derived in a different way 
and hence are seldom thought of as being 
related to any of the others. However, 
when sampling is large enough to justify 
percentile scores, that is, large enough 
to make the distribution of raw scores fair- 
ly stable, the P-scores can also be derived 
from the raw series through the D-score 
process. An understanding of the nature of 
D-scores, therefore, is a chief desideratum 
of comparable scores. The formula useful 
in this transposition can best be put in a
combination of letter symbols and words as 
follows: 

P, = 1,00, (total area of normal 
curve), minus A, the portion of the normal 
curve area to the right of Dx on the base 
line, when D, is secured as in (1). 


P, = 1.00-A (6) 
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-25 more of the total area. Consult the 
table of curve areas and find that the Dz 
normal curve area table.+ When the raw or x/ox which corresponds to .2500 in the 
score is less than the mean of its distri- area colum is .6745. Since the P-score 
pution, A = .500 plus the figure in the _is below the mean (.50 is the mid value of 
area column corresponding to the D-score of | percentiles) this D-score is negative, or 
the raw X-score. When the raw X-score is -.6745. Therefore, Q, or R score of .25 
creater than its mean, A = .500 minus the = a Dy score of -.6745. 
ficure in the area columm corresponding to | Because Quartiles and Deciles are fre- 
the D-score of the raw X-score in hand. quently useful in analyzing raw scores the 
An illustration will make the pro- following table will be found a material 
cedure more clear. If a raw score in X se- | aid. It is compiled by using the method 
ries is 60 when the mean of X is 90 and the | employed in the example above. 
o, is 15, what is its P-score or Py value? | 


In finding the value of A the follow- 
inc rule will be helpful. First secure a 























By formula (1) 



































p = © « -2.00. | — 
The normal curve area table shows that when, in column x/σ (the D-score), the value is 2.00, the figure in the corresponding area column is .4772. What do these figures signify? In the first place, this D-score is below, or to the left of, the mean, because 60 is less than 90. Above the mean is 50 percent of the area. Between the mean and the -2.00 D-score is 47.72 percent more of the area. By formula (6), then, there is 50 + 47.72 or 97.72 percent of the area, or number of cases involved, above this score. That is to say, A in formula (6) is 97.72 percent or .9772. Therefore, in this case, we have Px = 1.00 - .9772 = .0228 = 2.28 percentile.²

To find a given P-score in terms of D-scores the same process can readily be reversed. For example, given a P-score of .25, or Q1, as it is frequently called, what is its value in the D-score series? By substitution in (6) above:

    .25 = 1.00 - A
    A, then, = .75 of the total area.

But when Dx = 0, 50 percent of the total area is to the right of that point. Therefore, Dx must be less than 0, or negative, to the left of 0 on the base line sufficiently far to comprehend .75 - .50 or .25 of the total area.

QUARTILES, DECILES, AND CORRESPONDING D-SCORES

Point    P-score    D-score
Q1       .25        -.6745
Q3       .75        +.6745
D1       .10        -1.2817
D2       .20        -.8416
D3       .30        -.5244
D4       .40        -.2533
D5       .50         .0000
D6       .60        +.2533
D7       .70        +.5244
D8       .80        +.8416
D9       .90        +1.2817

The adequacy of these relationships depends upon the appropriateness of the basic assumption involved; namely, that the group being considered is fairly normal. From this discussion it is hoped that the reader will be better enabled to make his different series of scores comparable. If he is given the group mean and the group sigma, any or all of the scores of a given series can be reduced by the suggested methods. Or, given the scores and their Px, the Dx can be computed. From Dx the Vx or Tx may be computed by employing the formulae herein derived and presented.
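The two conversions worked through above (D-score to P-score, and P-score back to D-score) can be sketched as follows. This is only an illustration under the normality assumption the text itself makes; the function names are hypothetical, and Python's `statistics.NormalDist` stands in for the printed normal curve area table.

```python
from statistics import NormalDist

# Unit normal curve; replaces lookup in a printed area table (assumption).
STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def p_score_from_d(d_score):
    """Formula (6): P = 1.00 - A, where A is the area above the D-score."""
    area_above = 1.0 - STANDARD_NORMAL.cdf(d_score)
    return 1.0 - area_above


def d_score_from_p(p_score):
    """Reverse process: the D-score below which a proportion P of cases fall."""
    return STANDARD_NORMAL.inv_cdf(p_score)


if __name__ == "__main__":
    print(round(p_score_from_d(-2.00), 4))  # D = -2.00 from the worked example
    print(round(d_score_from_p(0.25), 4))   # Q1 from the worked example
```

Run on the two cases in the text, the sketch gives a P-score of about .0228 for D = -2.00 and a D-score of about -.6745 for Q1.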



















1. Morton, Robert Lee, Statistical Tables, New York: Silver Burdette and Co., pp. 44-47, or any other normal curve area tabulation.
2. Note that in speaking of percentiles the decimal is frequently omitted. A P-score of .0228 is the same as percentile 2.28.
THEIR CONSTRUCTION AND USES*


by 
Hal D. Draper 
Fresno State College 
Fresno, California 


It has become widespread practice in the field of education to employ the normal probability curve in assigning course "grades" or "quality marks"; a more or less standard distribution for normal classes being: 7% A's and F's (failures); 24% B's and D's; and 38% C's. This distribution of grades is based on the following application of the probability curve, the equation for which is

    $y = y_0 e^{-x^2 / 2\sigma^2}$

where y_0 is the height of the central ordinate, e is the base of the natural logarithms, and σ is the so-called standard deviation. Taking the central ordinate at 0 (zero) on the base line, distances to the right being positive, ordinates are erected at x = ±1/2σ, and at x = ±3/2σ. The letter grades are assigned to intervals along the base line as indicated in the upper scale of Fig. 1, page 185.

The area between the curve, the base line and the two ordinates is a measure of the number of scores falling in each grade for a "normal distribution", and yields 38.30% for C, 24.17% each for B and D, leaving 6.68% each for A and F. While, as indicated in Fig. 1, it is necessary to proceed from -∞ to +∞ in order to include 100% of the scores in a theoretical array, cutting off the base line at ±2-1/2σ neglects only 1.24%, and experience shows that in an actual array of test scores, these limits are rarely exceeded. Cutting off the base line at ±3-1/3σ neglects only 0.08% of a theoretical array, and an actual array very rarely yields a score outside of these limits.
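The percentages quoted above can be checked numerically. The sketch below is an illustration only: the helper names are hypothetical, and `statistics.NormalDist` is used in place of a printed area table.

```python
from statistics import NormalDist

Z = NormalDist()  # unit normal curve, sigma = 1


def area_between(lo, hi):
    """Proportion of a normal array falling between two ordinates (in sigma units)."""
    return Z.cdf(hi) - Z.cdf(lo)


def neglected(cutoff):
    """Proportion of a theoretical array lying outside +/- cutoff sigma."""
    return 2.0 * (1.0 - Z.cdf(cutoff))


share_c = area_between(-0.5, 0.5)  # C: within 1/2 sigma of the mean
share_b = area_between(0.5, 1.5)   # B (and, by symmetry, D)
share_a = 1.0 - Z.cdf(1.5)         # A (and, by symmetry, F)

if __name__ == "__main__":
    for label, share in (("C", share_c), ("B or D", share_b), ("A or F", share_a)):
        print(f"{label}: {100 * share:.2f}%")
    print(f"outside 2-1/2 sigma: {100 * neglected(2.5):.2f}%")
    print(f"outside 3-1/3 sigma: {100 * neglected(10 / 3):.3f}%")
```

The computed shares agree with the text to within rounding; the tail beyond ±3-1/3σ comes out slightly under 0.09%, which the text gives as 0.08%.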








The whole theory upon which the probability curve is based shows that its use in educational work approaches validity only for fairly large classes of non-specialized students: to apply it to small classes, or to advanced classes--whether large or small--probably is not justified. In the hands of a teacher who recognizes its limitations, however, it can be used as a valuable guide in assigning grades even in small or advanced classes. In this discussion, it will be assumed that the curve may be applied legitimately.

While final grades are quite generally reported in the above letter-grades, or their equivalents, standings on individual tests are frequently given in terms of "thirds" of letter grades (e.g., C-, C, and C+) by dividing each σ-interval on the base line into thirds as indicated in the middle scale of Fig. 1. Many teachers feel that the validity of their tests enables still finer gradations of quality to be distinguished. The writer proposes the adoption of a suitable Standard Score Number Scale (SsNS) to meet this need. The SsNS is merely a set of numbers applied to an equally spaced scale along the base line of the curve with the zero at a suitable distance to the left of the central ordinate, which represents a "middle C" grade, and so adjusted that divisions between thirds of letter-grades will fall halfway between numbers on the scale.

1. References to the literature have been largely omitted from this paper. In a paper entitled "Marks and Marking Systems: A Digest", Jour. of Educ. Research, XXVII (December, 1933), pp. 259-272, A. Duryee Crooks has prepared a rather extensive bibliography covering this field to which the reader is referred for more detailed information.

Fig. 1. The normal distribution, letter-grades, and standard score norms

In addition to the advantage of providing a means for expressing any desired degree of gradation in quality within a letter-grade, such a SsNS, by reducing several different test scores to the same basis of comparison, enables one to compute quantitatively a student's average ("weighted" if desired) by standard methods--a task which is almost impossible when letter-grades are used. For example, take the case where a student's average is to be determined from the results of two mid-term and one final examination, the final to count twice as much as a mid-term. If test grades such as C, B and C+ are recorded for the student, it is difficult to decide whether his average should be a B- or a C+. Adopting a SsNS such that, say, 57.5 < C < 63.5 < C+ < 69.5 < B- < 75.5 < B < 81.5, and recording the three scores as 62, 80 and 69, enables one to determine the "weighted average" as follows: (62 + 80 + 2 x 69) ÷ 4 = 70, which gives a B- as the correct average. If the three test grades are 58, 76 and 69, however, the weighted average is only 68, giving C+ as the correct average grade for the student. Another method for determining the weighted average of a series of Ss-numbers will be given later.

In constructing and using a SsNS, several principles must be thoroughly understood. First, it must be strongly emphasized that Ss-numbers are not to be confused with "percentage-of-achievement" marks. Teachers (and students) who are familiar with the rather widely used "percentage-of-achievement-scale" which makes a test score of less than 60% an F; 60-69% a D; 70-79% a C; etc., are disturbed to find that Ss-numbers derived from certain poorly chosen SsNS's have values widely divergent from percentage marks--a definite "positive achievement" may receive a negative Ss-number on the one hand, or a very low percentage mark may receive a disproportionately high Ss-number on the other.

It must be admitted that when there is a large discrepancy between the Ss-numbers and the empirical scores (SE-numbers) on a test, a bad psychological effect may be produced in the student. Thus, if on a test an SE of, say, 30 is transmuted into a Ss of -5, the student is likely to be resentful and puzzled to account for a definite "positive achievement" yielding a negative score. On the other hand, if the SE of 30 yields a Ss of 65 (which still may be a "failing" grade), the student is likely to feel that he has been "pretty lucky" to get a mark of more than twice the number of points answered on the test. In either case, he is not encouraged to put forth greater effort in his subsequent work.
In using a SsNS, therefore, it must be borne in mind that to overcome the difficulties mentioned above, each test should be designed to fit the SsNS chosen. After discussing the principles involved in constructing the SsNS, we will be in a position to determine the nature of the tests which will fit the scale.

If Ms is the Ss-number which is to represent a "middle C" grade; ME the mean (average) of the SE's in a test (which is to be set at the "middle C" grade); n the number of degrees of gradation in quality to be distinguished in each third of a letter-grade (i.e., 3n is the number of gradations within an interval of 1σ on the base line of the curve); then in a test for which the standard deviation, σ, has been determined, the standard score, Ss, corresponding to any given empirical score, SE, on the test is given by the equation

    $S_s = M_s + \frac{3n}{\sigma}(S_E - M_E)$    (1)

This may be rearranged to give a form more convenient for computing the standard scores on a calculating machine as follows

    $S_s = \frac{3n}{\sigma}\left[S_E - \left(M_E - \frac{M_s\sigma}{3n}\right)\right]$    (2)

Equation 2 shows that the zero on the SsNS will correspond to an empirical score

    $S_E = M_E - \frac{M_s\sigma}{3n}$    (3)

Now, both from the psychological reasons mentioned above, and the practical consideration that negative numbers are more difficult to handle in computing averages, negative standard scores are undesirable. To avoid obtaining negative Ss's, we see from Equation 2 that the term in the square brackets must never have a negative value. This implies that we must make Ms/n large enough so that (Msσ)/3n is practically always numerically larger than ME; thus making the term in brackets positive even when SE = 0. As pointed out above, scores in an actual array very rarely fall below SE = ME - 3-1/3σ: setting the zero on the SsNS at this point on the base line of the curve, in the writer's experience, has been found to give a very satisfactory scale from the standpoint of rarely yielding negative Ss's. Of course, setting the zero on the SsNS further to the left will still further decrease the possibility of obtaining negative Ss's, but from the relationship between Ms and n in Equation 2, this requires an undesirably large value for Ms or too small a value for n.

By adopting -3-1/3σ for the zero point, we thereby fix the value of Ms = 10n: Equations 1 and 2 then become

    $S_s = 10n + \frac{3n}{\sigma}(S_E - M_E)$, and    (4)

    $S_s = \frac{3n}{\sigma}\left[S_E - \left(M_E - 3\tfrac{1}{3}\sigma\right)\right]$    (5)

An examination of Equations 1 to 5 shows that only two of the three terms, Ms, n, and the 0-point on the SsNS, can be arbitrarily fixed: having chosen the zero-point, we may set either Ms or n, but not both. Thus if it is desired that Ms should be set at, say, 70, then n = 7, i.e., all tests which are designed to fit this SsNS should be capable of distinguishing valid gradations of quality of 7 points within a third of a letter grade. This requires (as shown below) that each test should be capable of giving not less than 140 distinct scores.¹

These equations tell us further that the points on the SsNS separating the thirds of letter-grades fall at odd multiples of n/2: in order, therefore, to have these points fall halfway between numbers on the scale, for even values of n, it is necessary to add or subtract 1/2 to the left-hand side of Equations 4 and 5. For n an even number, then, we will use the following equations








1. Many teachers feel that they are increasing the capacity for distinguishing gradations in quality of a test having a small number of items by giving a large number of "points" for each item: this is entirely erroneous. If a test contains, say, 14 problems for each of which 10 "points" are given, with a possibility of obtaining "half credit", instead of yielding a test capable of distinguishing seven gradations within a third of a letter-grade, only 28 distinct scores are possible, making for only one (or at most, two) valid gradations within a third of a letter-grade.
    $S_s = 10n \pm \tfrac{1}{2} + \frac{3n}{\sigma}(S_E - M_E)$, or    (6)

    $S_s = \pm \tfrac{1}{2} + \frac{3n}{\sigma}\left[S_E - \left(M_E - 3\tfrac{1}{3}\sigma\right)\right]$    (7)
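The standard-score equations can be sketched in a few lines. This is an illustration rather than the writer's own procedure: the function and parameter names are hypothetical, the zero point is assumed to sit at -3-1/3σ (so that Ms = 10n), and for even n the half-step is taken as +1/2, the sign used in Table II.

```python
def standard_score(s_e, mean_e, sigma, n, half_step=False):
    """Equation 4 (or Equation 6 with +1/2 when half_step is True).

    s_e     empirical score on the test
    mean_e  mean of the empirical scores, set at "middle C"
    sigma   standard deviation of the empirical scores
    n       gradations distinguished within a third of a letter-grade
    """
    score = 10 * n + (3 * n / sigma) * (s_e - mean_e)
    if half_step:  # assumed sign convention for even n
        score += 0.5
    return score


def zero_point(mean_e, sigma):
    """Equation 3 with Ms = 10n: the empirical score that maps to Ss = 0."""
    return mean_e - (10.0 / 3.0) * sigma


if __name__ == "__main__":
    # Hypothetical test: mean 60, sigma 15, n = 5.
    print(standard_score(75, 60, 15, 5))   # one sigma above the mean
    print(zero_point(60, 15))              # empirical score at the scale zero
```

With these hypothetical figures a score one σ above the mean lands 3n points above "middle C", i.e. at 65, which Table I lists as the mid-point of B for n = 5.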


Tables I and II, page 188, give the values of the Ss-numbers which fall between the thirds of letter-grades, and at the mid-point, for odd and even values of n, respectively. These are taken directly from Equations 5 and 7. Values for n = 5 and n = 6 are also shown as an illustration. These are the values that the writer recommends as being the most suitable for the ordinary examination, odd numbers being preferable to even, in general.

We are now in a position to discuss the method of adjusting a test to fit the particular SsNS which may be adopted. First, the number of items in the test should not be less than enough to provide about 20n distinct scores (see footnote on page 186). Second, the difficulty of the items should be such that the poorest student will get some score, whereas the best student is not likely to answer more than about 90% of the items. This implies that the teacher has some knowledge concerning the abilities of his students, and also some information about the difficulty--from the students' standpoint--of the items he proposes to use in the test. For a teacher who has considerable experience with teaching a given course, the use of objective type tests which have been employed a number of times in preceding classes furnishes the best means for attaining this second objective.

It follows from the above that a test designed to yield from 90 to 120 distinct scores cannot be expected to yield valid gradations of more than four to six points within a third of a letter-grade (n = 4 to 6). In the writer's experience, with True-False type tests, where the score is obtained by subtracting the wrong answers from the right ones, a larger number of items must be provided to get the same gradation--between 25n and 30n has been found satisfactory.

Many teachers have adopted the practice of giving frequent short quizzes of from 15 to 25 items, supplemented by one or more mid-term examinations and a final examination. The course grade is then determined from these scores by some empirical method of "weighting." As pointed out above, the use of a SsNS enables one to weight the scores in a quantitative--though not strictly objective--manner, since nearly any method of weighting involves a large subjective element. It must be realized--as many writers have pointed out--that assigning equal value to each item in a test involves in many cases a tremendous subjective weighting.

One of the methods of weighting test scores in computing an average has already been described, i.e., expressing each test score in terms of the same SsNS (which we will designate the Basic SsNS), and then assigning an arbitrary weight to each test. The general formula for obtaining a "weighted mean" from k scores, S1, S2, S3, ..., Sk, is

    $M_w = \frac{\sum w_i S_i}{\sum w_i}$    (8)

where w_i is the weight of the i-th score, and Σ signifies taking the algebraic sum of the terms to the right of the symbol.
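As a quick check of the weighted-mean rule, the sketch below applies it to the two sets of scores used earlier in the paper (two mid-terms and a final counting double). The function name is hypothetical.

```python
def weighted_mean(scores, weights):
    """Weighted mean of standard scores: sum(w_i * S_i) / sum(w_i)."""
    if len(scores) != len(weights):
        raise ValueError("each score needs exactly one weight")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * s for s, w in zip(scores, weights)) / total_weight


if __name__ == "__main__":
    # Two mid-terms (weight 1 each) and a final (weight 2).
    print(weighted_mean([62, 80, 69], [1, 1, 2]))
    print(weighted_mean([58, 76, 69], [1, 1, 2]))
```

The two calls reproduce the averages of 70 and 68 worked out in the earlier example.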

Another method that is quite satisfactory, especially for those who give short quizzes together with longer examinations, is the following: for the short quizzes, involving from 20 to 25 items, express the Ss's in terms of a SsNS taking n = 1. For the mid-term examinations, express the score in terms of a SsNS taking n = 5: similarly, the final examination may be expressed in terms of a SsNS taking n = 10 or 20, depending upon the number of items involved, i.e., the number of distinct scores possible. The mean of these scores is obtained by the equation

    $M_B = \frac{n_B \sum S_i}{\sum n_i}$

where n_B is the value of n taken for the Basic SsNS, and n_i is the value of n taken for the i-th test.

While this equation gives exactly the same results as that of Equation 8 when all scores have been reduced to the Basic SsNS, the latter method is better adapted to the use of teachers who employ the "running total" method of recording test scores. This practice is to be commended since a student's standing in the class can be determined at any time with the minimum of effort. For example, if a student has the successive Ss-numbers on quizzes (n = 1) of 8, 14, 12, 9; 53 on a mid-term (n = 5); and 14, 10, 10 on quizzes, his standing at any time can be recorded by summing up all the test scores to date. This yields the successive "running totals" of 8, 22, 34, 43, 96, 110, 120, 130. If these scores are made available to the class, each student can quickly ascertain his standing relative to the class at any time, and by multiplying his total score by the current value of n_B/Σn_i, he can determine--approximately at least--his letter-grade. Thus, after the mid-term examination in the example above, Σn_i = 9, and if n_B = 5, the student's score is found to yield (5 x 96)/9 = 53, which is a C+ grade according to the results in Table I.

TABLE I, FOR ODD VALUES OF n (illustrated for n = 5), AND TABLE II, FOR EVEN VALUES OF n (illustrated for n = 6)

                  TABLE I                                     TABLE II
Letter-   Ss between letter-   Ss at the           Ss between letter-      Ss at the
grade     grades (below)       mid-point           grades (below)          mid-point
          general    n = 5     general   n = 5     general        n = 6    general       n = 6
F----     -0.5 n     -2.5      0 n       0         -0.5 n + 1/2   -2.5     0 n + 1/2     0.5
F---      0.5 n      2.5       1 n       5         0.5 n + 1/2    3.5      1 n + 1/2     6.5
F--       1.5 n      7.5       2 n       10        1.5 n + 1/2    9.5      2 n + 1/2     12.5
F-        2.5 n      12.5      3 n       15        2.5 n + 1/2    15.5     3 n + 1/2     18.5
F         3.5 n      17.5      4 n       20        3.5 n + 1/2    21.5     4 n + 1/2     24.5
F+        4.5 n      22.5      5 n       25        4.5 n + 1/2    27.5     5 n + 1/2     30.5
D-        5.5 n      27.5      6 n       30        5.5 n + 1/2    33.5     6 n + 1/2     36.5
D         6.5 n      32.5      7 n       35        6.5 n + 1/2    39.5     7 n + 1/2     42.5
D+        7.5 n      37.5      8 n       40        7.5 n + 1/2    45.5     8 n + 1/2     48.5
C-        8.5 n      42.5      9 n       45        8.5 n + 1/2    51.5     9 n + 1/2     54.5
C         9.5 n      47.5      10 n      50        9.5 n + 1/2    57.5     10 n + 1/2    60.5
C+        10.5 n     52.5      11 n      55        10.5 n + 1/2   63.5     11 n + 1/2    66.5
B-        11.5 n     57.5      12 n      60        11.5 n + 1/2   69.5     12 n + 1/2    72.5
B         12.5 n     62.5      13 n      65        12.5 n + 1/2   75.5     13 n + 1/2    78.5
B+        13.5 n     67.5      14 n      70        13.5 n + 1/2   81.5     14 n + 1/2    84.5
A-        14.5 n     72.5      15 n      75        14.5 n + 1/2   87.5     15 n + 1/2    90.5
A         15.5 n     77.5      16 n      80        15.5 n + 1/2   93.5     16 n + 1/2    96.5
A+        16.5 n     82.5      17 n      85        16.5 n + 1/2   99.5     17 n + 1/2    102.5
A++       17.5 n     87.5      18 n      90        17.5 n + 1/2   105.5    18 n + 1/2    108.5
A+++      18.5 n     92.5      19 n      95        18.5 n + 1/2   111.5    19 n + 1/2    114.5
A+++++    19.5 n     97.5      20 n      100       19.5 n + 1/2   117.5    20 n + 1/2    120.5
(top)     20.5 n     102.5                         20.5 n + 1/2   123.5

The relation of these Ss-numbers to the Normal Curve is shown in the lower scale in Fig. 1. The writer does not advocate the adoption of such letter-grades as A+++++, F+, F--, etc.: for properly designed tests grades of A++++, F----, etc., should be met with very rarely. A better practice, perhaps, is to list all Ss's below 5.5n simply as a grade of F, and all scores above 16.5n as A+: this practice is indicated in the middle scale of Fig. 1. It is to be emphasized, however, that the adoption of such a SsNS as is here recommended will take care of such unusual scores without difficulty.
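The running-total bookkeeping described above can be sketched as follows, using the quiz and mid-term scores from the example. The function names are hypothetical, and the conversion simply multiplies the total to date by n_B/Σn_i as the text describes.

```python
from itertools import accumulate


def running_totals(scores):
    """Cumulative sum of standard scores, one entry per test taken so far."""
    return list(accumulate(scores))


def basic_scale_mean(total, n_values, n_basic):
    """Convert a running total to the Basic SsNS: n_B * total / sum(n_i)."""
    return n_basic * total / sum(n_values)


if __name__ == "__main__":
    scores = [8, 14, 12, 9, 53, 14, 10, 10]   # quizzes, mid-term, quizzes
    n_values = [1, 1, 1, 1, 5, 1, 1, 1]       # n used for each test
    totals = running_totals(scores)
    print(totals)
    # Standing just after the mid-term (first five tests), Basic SsNS n = 5.
    print(round(basic_scale_mean(totals[4], n_values[:5], 5), 1))
```

The totals match the series 8, 22, 34, 43, 96, 110, 120, 130 given in the text, and the standing after the mid-term comes to about 53.3, inside the C+ interval (52.5 to 57.5) of Table I.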

While the SsNS will be especially useful for objective type tests, it should be adaptable to use by teachers who adhere to subjectively scored tests on a "percentage-of-achievement-scale." Unfortunately, many teachers have not attempted to apply statistical methods in their classes on account of the rather formidable appearing mathematics involved.

To overcome this difficulty, the writer has devised a simple form, The Draper Histogram and Standard Score Form--soon to be published--which combines the advantages of algebraic and graphic methods in determining, by the normal curve, the standard scores for an array of empirical test scores. With this form there is no necessity for preparing a "distribution chart" for every test: tally marks are entered in a histogram form (providing for a range of 100 points) opposite a scale giving the empirical test score. (This SE-scale is shown along the left margin of the Histogram Form in Figs. 2 and 3.) Simple and explicit directions are given for computing by standard methods² the (approximate) median, mean, standard deviation, σ, and the mode for the array. Even one who is unfamiliar with the theory of statistical methods will have no difficulty in carrying out the computations, since each successive step is clearly indicated.

The greatest time-conserving feature of the method, however, lies in the rapid, graphic assignment of Ss-numbers and letter-grades. This is accomplished by means of a Standard Score Assignment Form (SsAF), the form with the radiating lines shown in Figs. 2 and 3.³ The central horizontal scale running from 0 (on the right) to 25 refers to the value of σ as determined for a particular test. The vertical scale at the left of the SsAF is the Basic SsNS, which in the illustration is taken with the 0-point at -3-1/3σ, and with n = 5.

The method of using the form in assigning Ss's for a test in which the distribution is approximately normal (a point which can be determined by a glance at the histogram for the array) is illustrated in Fig. 2. For this particular test, the mean was found to be 56.90, and σ = 18.35. The Histogram Form is therefore placed with the edge perpendicular to the central scale on the SsAF, and with 56.90 on the SE-scale of the HF coinciding with 18.35 on the central scale of the SsAF. The points on the SE-scale where the radiating lines on the SsAF intersect then give the SE's corresponding to the Ss-numbers represented by these radiating lines. Thus, it can be seen from the figure that the empirical score of 97 yields a standard score of 74, which is an A-.

With these same forms, standard scores can be assigned to "skewed" arrays with a minimum of empirical assumptions, and these justifiable on rational grounds. The application of the forms to this purpose is illustrated in Fig. 3 (the same test being used as in Fig. 2), which obviously is skewed downward slightly.

2. Rugg, H. O. Statistical Methods Applied to Education, Houghton, Mifflin Co., (1917). Lang, A. R. Modern Methods in Written Examinations, Houghton, Mifflin Co., (1930).
3. The Histogram Form shown in these figures is a crude model prepared on a mimeograph which the writer has been using for several years. The SsAF shown was hastily prepared and drawn to half-scale in order to provide the cuts for this paper. Owing to the reduction in scale, the scale divisions on the horizontal scales were omitted, and the radiating lines connecting the points on the SsNS were inserted only in the "A letter-grade" region. On the full sized forms, the HF is 8-1/2" x 11", and the SsAF is 17" x 22".

Fig. 2. Illustrating use of Draper Standard Score Assignment Form and Histogram when distribution is considered normal

Fig. 3. Illustrating use of Draper Standard Score Assignment Form and Histogram when distribution is considered skewed

The assumptions that are made in treating moderately skewed arrays of test scores are these:

1. The "theoretical mode" is a better measure of "central tendency" than is the mean or the median. The mode, calculated by Pearson's empirical rule, Mo = M - 3(M - Md), is therefore taken for the "middle C" grade.

2. A $\sigma' = \sqrt{\sum f d^2 / N}$, where the deviations, d, are computed from the mode, should be utilized instead of the σ as ordinarily computed from the mean.

3. The total range of Ss's from the bottom of the F- to the top of the A+--i.e., from 2.5n to 17.5n--should be taken as 5σ'. (As already mentioned, scores outside of ±2.5σ are rarely obtained.)

4. A 100-percentile skewness index is computed as follows: $\text{Sk.In.}_{100} = \frac{A - Mo}{Mo - B}$, where A is the highest and B the lowest empirical score obtained in the test.

5. The ratio of the Ss-interval from the top of the A+ (17.5n) to the "middle C" grade (10.0n) to the interval from the "middle C" grade to the bottom of the F- (2.5n) should be made equal to the Sk.In.100.

6. The intervals on the SsNS should give uniformly increasing intervals on the SE-scale throughout its length.
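Points 1 and 4 above reduce to two short formulas, sketched here for illustration. The function names are hypothetical; the median of 56.75 used in the example is not given in the paper but is the value implied by its mean (56.90) and mode (56.45), and the highest and lowest scores are invented for the demonstration.

```python
def pearson_mode(mean, median):
    """Pearson's empirical rule: Mo = M - 3(M - Md)."""
    return mean - 3.0 * (mean - median)


def skewness_index(highest, lowest, mode):
    """100-percentile skewness index: (A - Mo) / (Mo - B)."""
    if mode == lowest:
        raise ValueError("mode must lie above the lowest score")
    return (highest - mode) / (mode - lowest)


if __name__ == "__main__":
    # Mean from the paper's illustration; median is an inferred assumption.
    mode = pearson_mode(56.90, 56.75)
    print(round(mode, 2))
    # Hypothetical extremes A = 100 and B = 10 around a mode of 55.
    print(skewness_index(100, 10, 55))
```

An index below 1 means the upper interval (mode to highest score) is shorter than the lower one, which is the sense in which the text calls such a distribution "skewed downward".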

It can be seen that the application of the above principles to the assignment of Ss's to a skewed array resolves itself into a rather simple geometrical problem when the writer's forms are employed. Thus, placing the HF on the SsAF with the edge of the former at an angle to the central scale of the latter (the angle θ in Fig. 3) will give uniformly increasing intervals on the SE-scale throughout the length of the Ss-scale (Point 6 above), and obviously the interval from the top of the A+ to the "middle C" grade can be adjusted with any desired ratio to the interval from the "middle C" grade to the bottom of the F- by selecting the correct value for the angle θ. (Point 5.)
It is also geometrically obvious that rotating the HF through the angle θ, keeping the edge of the former at a fixed division on the central scale of the SsAF, will increase the interval on the SE-scale between 2.5n and 17.5n on the Ss-scale. This may be compensated for by translating the HF to the right until the left-hand edge coincides with some definite division (σ'') on the central scale--less than σ'. Principle 1 may then be complied with by bringing the mode on the SE-scale to the central line on the SsAF.

Spaces are provided on the HF for computing σ' (Point 2), the 100-percentile skewness index, Sk.In.100, and the value of σ''. By means of the "Sk.In.-scale"--the scale along the right-hand margin of the HF--and the upper horizontal scale on the SsAF, the angle θ is graphically determined.

In the illustration given in Fig. 3, the mode, Mo = 56.45, σ' = 18.35, and Sk.In.100 = 0.91 (which shows that the distribution is skewed downward slightly). A mark is then ruled at 0.91 on the "Sk.In.-scale", crossing the adjacent "R-scale" at 0.98. This gives the factor by which σ' must be multiplied in order to give σ'' (18.35 x 0.98 = 17.99 = σ'').

The HF is then placed with the "Sk.In.-scale" coinciding with the upper horizontal scale on the SsAF with the 1.0-division on the former at 17.99 (σ'') on the latter. A light pencil mark is made on the SsAF-scale where the (previously marked) 0.91-division on the "Sk.In.-scale" falls, and a light pencil line is ruled through this point and the 17.99 (σ'') division of the central scale of the SsAF. When the left-hand edge of the HF is then placed on this pencil line, and with the mode (56.45) on the SE-scale coinciding with the central scale line, the situation shown in Fig. 3 is obtained. The graphic assignment of Ss-numbers and letter-grades is then made as before. It will be noted that the total range of SE's from the bottom of the F- to the top of the A+ is quite close to 5σ' = 91.75, and that the ratio of the intervals from the top of the A+ to the "middle C", and from the "middle C" to the bottom of the F-, is quite close to 0.91, the Sk.In.100.
The effect of considering the skewness in this test, and using the mode rather than the mean for the "middle C" grade, has been to raise slightly all of the Ss's for the array, the extreme scores being affected more than those near the center of the distribution. In addition, the rather desirable result is obtained of having the top of the A+ interval brought much closer to the maximum possible SE on the test than is the case when the skewness is neglected. This result is brought about in a large proportion, if not a majority, of the cases which the writer has encountered, though the very close agreement in the present illustration is rather fortuitous.

The writer has adopted a practice which still further diminishes the labor involved in reporting test grades to his classes in that it obviates the necessity of preparing separate lists of grades for posting. In addition, it has had a stimulating effect upon the students in promoting a healthful rivalry between the members of each laboratory section, and between the different sections as a whole. At the beginning of the semester, each student is assigned a serial number which identifies the section of which he is a member as well as his position in the section. Thus the first member in Section 1 is given the number 101, the second member, 102, etc. When a test is given and scored, the last one or two digits of the student's number are inserted in the correct "histogram block" instead of using a tally mark, or "blocking in" as shown in the figures. Different colored pencils are used to distinguish the sections.

When the Histogram Form is completely filled out, and the lines ruled across between the thirds of letter-grades as shown in the figures, it is posted on the (locked) bulletin board. Each student, by locating his serial number in the histogram, can quickly determine his empirical score, his letter-grade, and by interpolation, his standard score in the test. In addition, he can see what all the other students have done on the test and estimate his achievement compared to the class as a whole. With the histogram before him, and with the knowledge that the assignment of standard scores is the result of an objective application of mathematics which treats all the students alike, he is much less likely to feel that he has been discriminated against if his score is not a good one. In the past several years, since adopting this procedure, out of over a thousand freshman students passing through the writer's classes, scarcely a dozen have ever seriously questioned the fairness of the grade he received.
A METHOD OF PROVIDING A MORE VALID DISTRIBUTION OF SCHOOL MARKS


by 


R. W. Edmiston 
Miami University 


INTRODUCTION 


The school mark which remains upon the permanent record should be as valid an estimate of the achievement recorded as the teacher can possibly provide. Improved measures offer means of providing more exact evaluation. It is hoped that a more scientific professional training will eliminate any tendency for the teacher to consider other than achievement in arithmetic when determining the mark in arithmetic.

The distribution of marks according to percents said to be derived from the normal curve is familiar to the educational profession. While the percents used in the distribution differ, a common one offers 10 percent of the number of pupils the highest mark, A, 20 percent B's, 40 percent C's, 20 percent D's, and 10 percent F's. This particular distribution will be designated as the normal curve method in the further discussion. Small or selective schools and classes do not provide normal groups. The fact that a larger group attains the norm on a test does not assure a normal distribution. Only carefully standardized tests provide data for direct location of marks with respect to some norm. These tests are available in some elementary school subjects. Since many marks are based upon the results of informal tests, a substitute for the normal curve method of distribution is desirable. This substitute will be called the "Standard Deviation Method of Distribution of Marks"; it is an attempt to apply statistical knowledge more correctly and could be designated "Improving the Use of the Normal Curve in Marking."










STATISTICAL CONSIDERATIONS 


Since divergence from the mean score, and not the number of scores, is
the measure desired, standard deviation distance from the mean is a more
direct basis for marks than the corresponding percents of a normal
group. Slight changes of the percents previously given are made to
conform to standard deviation distances.¹ Thus the 10 percent A's become
scores 1.3σ or further above the mean; the 21 percent B's are scores
between .5σ and 1.3σ above the mean; the C's, or 38 percent adjacent to
the mean, are scores from -.5σ to .5σ; the 21 percent D's are scores
between -1.3σ and -.5σ, below the mean; the 10 percent F's are scores
-1.3σ or further below the mean. The test scores of any group may be
converted into comparable form by applying to each set of scores the
formula,


$$\text{C-sc} = 50 + \frac{10(S - M)}{\sigma}$$


where S is the individual's score, M the group's mean score, and σ the
standard deviation. The mean of each converted set of test scores will
be 50 and the standard deviation will be 10. Thus 1.3σ or more above the
mean becomes 63 and above; similarly converted scores of 51 to 62, 45 to
50, 38 to 44, and 37 and below represent B, C, D, and F marks,
respectively.
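The conversion can be sketched in a few lines of Python. This is only an illustration: the function name `converted_score` is hypothetical, and the mean of 65 and standard deviation of 19.5 are borrowed from the sample computation given later in the article.

```python
def converted_score(score, mean, sd):
    """Converted score C-sc = 50 + 10(S - M)/sigma."""
    return 50 + 10 * (score - mean) / sd


# Mean 65 and standard deviation 19.5 are taken from the article's
# sample computation; converted scores are reported as whole numbers.
for raw in (30, 65, 90):
    print(raw, round(converted_score(raw, 65, 19.5)))
```

A score at the group mean always converts to 50, and a score one standard deviation above the mean converts to 60, whatever the original scale of the test.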

A beginning teacher has no standards 
from previous pupils' work upon which to 
base an estimate of the location of the 
mean of her present group's achievement. 
After using a test with several groups, the 
mean to be assigned to a normal group can 
be estimated. All marks will then be 
































1. These divisions could be shifted at will, and the method described does not remove the need of studies to determine
more scientific criteria for the different school marks.
computed from this mean rather than the mean of the group considered.
Otherwise, the system described does not provide for location of the
mean as a mark of relative ability. If from the average of a group's
intelligence scores or former achievement test scores their mean is
estimated to be, for example, -.2 of a standard deviation from the
normal mean, all marks can be determined according to the above
provisions after the mean has been corrected by adding .2 of the
standard deviation.¹


Computation According to This Plan


SAMPLE COMPUTATION IN ORDINARY SITUATION 
WHERE MEAN IS NOT CONCERNED 


                    d =         d² =                Converted Score
         Test       deviation   deviation           C-sc = 50 + 10(S - M)/σ
Pupil    Score      from mean   squared     Mark
  1       30          35         1225        F          32
  2       32          33         1089        F          33
  3       40          25          625        D          37
  4       42          23          529        D          38
  5       44          21          441        D          39
  6       45          20          400        D          40
  7       45          20          400        D          40
  8       51          14          196        D          43
  9       53          12          144        D          44
 10       55          10          100        C          45
 11       59           6           36        C          47
          65 (Mean)
 12       66           1            1        C          51
 13       68           3            9        C          52
 14       71           6           36        C          53
 15       75          10          100        B          55
 16       76          11          121        B          56
 17       77          12          144        B          56
 18       77          12          144        B          56
 19       81          16          256        B          58
 20       81          16          256        B          58
 21       82          17          289        B          59
 22       88          23          529        B          62
 23       90          25          625        A          63
 24   [illegible]                            A          64
 25   [illegible]                            A
N = 25   ΣS = 1618               Σd² = 9508

d = test score - mean, often written S - M.
M = (sum of test scores)/(number of pupils) = ΣS/N = 1618/25 = 64.72 or 65.
N = number of pupils = 25. Σ = summation or sum of.
d = distance each score is from 65, the mean.
Standard deviation, or

$$\sigma = \sqrt{\frac{\Sigma d^2}{N}} = \sqrt{\frac{9508}{25}} = 19.5$$

$$.5\sigma = .5 \times 19.5 = 9.75 \text{ or } 10$$

$$1.3\sigma = 1.3 \times 19.5 = 25.35 \text{ or } 25$$

From -.5σ to .5σ from the mean = scores from (65 - 10) to (65 + 9), or
scores of 55 to 74.² These scores are given marks of C. From -.5σ to
-1.3σ = 25 - 10 or 15 units below 55, or 40 to 54, which scores are
marked D. Scores lower than -1.3σ, or 40, are marked F. From .5σ to
1.3σ = 25 - 10 or 15 units above 74, or 75 to 89. Scores in this
interval are marked B. Scores above 1.3σ, or 89, are marked A.
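The cut-offs just described can be expressed as a short Python sketch. The function name `sd_mark` is hypothetical, and the treatment of scores falling exactly on a boundary is an assumption chosen to reproduce the intervals stated above (55 to 74 for C, 75 to 89 for B, and so on), with .5σ and 1.3σ rounded to whole score units as in the computation.

```python
def sd_mark(score, mean, sd):
    """Assign a letter mark from the distance of a score from the mean.

    Assumption: .5 sigma and 1.3 sigma are rounded to whole score units
    and each interval includes its lower limit, which reproduces the
    intervals 40-54 (D), 55-74 (C), and 75-89 (B) given in the text.
    """
    half = round(0.5 * sd)   # .5 sigma  -> 10 when sd is 19.5
    far = round(1.3 * sd)    # 1.3 sigma -> 25 when sd is 19.5
    if score < mean - far:
        return "F"
    if score < mean - half:
        return "D"
    if score < mean + half:
        return "C"
    if score < mean + far:
        return "B"
    return "A"


for raw in (39, 40, 54, 55, 74, 75, 89, 90):
    print(raw, sd_mark(raw, 65, 19.5))
```

Passing a corrected mean (for example 69 in place of 65) shifts every interval by the same amount, which is how the adjustment for a group above or below normal would enter.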

If it were known that the mean of this group's scores was .2 standard
deviation below the mean of a normal group, the mean used should have
been increased by .2 x 19.5 = 3.9, or approximately 4. This would
require that the mean be raised from 65 to 69 and all marks computed
from 69 as the mean. Except in the case of fairly small groups or
special schools with selective factors, the group mean approaches that
of a normal group rather closely.

Note that the percentages at the various marks do not agree with the
determined percentages of a normal distribution. There are 12 percent
A's rather than 10 percent, 32 percent B's rather than 20 percent, 20
percent C's rather than 40 percent, 28 percent D's rather than 20
percent, and 8 percent F's rather than 10 percent. It should be noted
that no one needs to fail by this system of distributing marks. When no
member of the class falls lower than -1.3 standard deviations from the
mean, there are no failures.

The data in the table on the following page show two injustices of
normal curve marking.

(1) In the two schools, the exceptional (according to the intelligence
scores) group in one school has received the same average mark as the
average (according to the intelligence scores) group in the other. There
is large divergence in the personnel of these two schools as shown by
the average intelligence of the two groups in the trigonometry classes.
Both of these groups







1. A similar adjustment of the mean could be made when using the normal curve method.
2. Since 65 is the first whole number above the actual mean, 64.72, 75 is not included.
EXAMPLES OF THE INJUSTICES PROVIDED BY THE NORMAL
CURVE DISTRIBUTION OF MARKS AND THEIR CORRECTION 
BY THE STANDARD DEVIATION DISTRIBUTION 


The data in this table result from the application of 
the normal curve marking in Trigonometry, an elective 
course, in two city schools where the average school mark 
for entrance to the trigonometry class was 85. 











sa ARRCAEREERR RENEE ' 
| werk Mark 
Pupil Av.Sch.*! in | Pupil! Av.Seh.”| in 
Wo. 1.0.2) mark |trig.} Wo. |1.¢.4/ mark | Trig. 
t t + Eee 
1 | 156| 95 A 1 | 106 a5 c 
2 120 95 i 2 | 19 87 Cc 
s | 185} 9s | ¢ s | 146 95 A 
a | 126 87 | C 4 | 107 87 D 
s | ige| 8s | p s | in| 80° F 
6 | 185] 9 A 6 | 110 85 D 
7 | ne} 9% c 7 | 107 87 c 
s | 14) 90 B 8 | 115 95 B 
9 |i) 90 : 9 | ne 90 B 
10 | 187| 87 D 10 | 107 90 Cc 
i | lez 85 c uu | ile 90 B 
12 | 126 95 A lz | le 92 c 
is | 124 87 c 13 | 106 | 87 c 
14 | 1se| 90 R u |iz| 9 c 
L Lig 85 D | 
ie | 1z9| 87 c 
17 | 129] 85 c | 
is | 126 95 B 
19 | lee 87 c 
20 | 128 92 A 
21 | 188 92 c 1 
22 | 128 85 D 
23 | lg 87 D 
24 | 128 87 . 
25 | 127 90 c 
26 | 125| 92 f oF eee 
av. | ize{ oo | c [ust 90 [ c¢ 























are in the upper third according to the achievement in their respective
schools. There is a difference of 14 points in the average I.Q. and very
little overlapping of individual I.Q.'s between the groups. This
situation offers little assurance to the colleges that welcome the upper
third and exclude the lower third.

(2) In the larger class there is lit- 
tle indication that any pupil should fail. 
One pupil with a general average of 90 
fails trigonometry. Ridiculous! 

The following computations will show 
how these pupils should have been marked. 


A different group in a different school was 


1. Same test used. 
used because all marking had followed the
normal curve method and the scores from 
which the marks were derived were available. 


THE MARKS BY THE STANDARD DEVIATION 
METHOD ARE OFFERED FOR COMPARISON 














                              Normal   C-sc. Av. of                   Stand.
                              Curve    tests in course                Devi-
Pupil            Av. Sch.     Course   from which mark                ation
No.     I.Q.     Marks²       Mark     was taken         d     d²     Course Marks
  1     148      [illegible]    A          70            20    400        A
  2     126      B+             C          46            -4     16        B
  3  [illegible] [illegible]    B          60            10    100        A
  4     121      A              A          65            15    225        A
  5     119      B              D          39           -11    121        C
  6  [illegible] [illegible]    F          36           -14    196        C
  7     117      A              B          63            13    169        A
  8     116      B+             C          42            -8     64        C
  9     115      B+             D          39           -11    121        C
 10     114      A-             C          45            -5     25        B
 11     113      B              C          51             1      1        B
 12     112      B+             B          58             8     64        A
 13     111      B+             C          45            -5     25        B
 14     111      B              C          46            -4     16        B
 15     110      B              C          44            -6     36        C
 16     109      B              B          57             7     49        A
Av.     117      B+             C                                         B+
Normal [illegible] C            C





Since the normal curve method of mark- 
ing had been used to obtain the marks in 
the required courses from which the average 
school mark was computed, the group's aver- 
age mark, B+, was 1.5 letters above the C 
average or in the upper one-half of the 
20 percent marked B. The + marks were 
not given on the records but are the result 
of averaging the recorded marks. 

Since this group's average mark, B+, places them above the middle of
the 20 percent of the scores between .5 and 1.3 standard deviations
above the mean, their achievement is approximately .2 standard deviation
less than 1.3σ from the mean, or 1.1σ above the mean.⁴ Correcting their
marks according to the mean of their achievement would necessitate
placing their mean at 1.1σ above the mean. Since their average course
mark is 50, this 50 is 1.1σ





2. Av. Sch. Marks are taken in required subjects where at least 100 pupils were entered under the same teacher.

3. No reason given for this pupil's mark being below that required for entrance.

4. There are 1.3σ - .5σ or .8σ in the 20 percent of this normal curve where the mark B was given. Of this .8σ, the upper
.4σ is the B+. The middle of this upper .4σ is .2σ from the upper limit, or 1.3σ. Therefore, 1.3σ - .2σ or 1.1σ
above the mean is the average achievement of this group. The group's mean is therefore 1.1σ or 1.1 x 10 = 11 points
above the mean of a normal group.
above the mean for a normal group. The standard deviation of their
course numerical marks is 10. Therefore their mean of 50 is 1.1 x 10 or
11 above the ordinary mean, or the ordinary mean is 50 - 11 or 39. One
score is below this corrected average but within .5σ of same and
therefore receives a C. Including .5 of 10 (the standard deviation), or
5 scores above the mean, shows 39, 40, 41, 42, 43, and 44 deserving of
marks of C; proceeding to 1.3σ, or to 1.3 of 10 = 13 points above the
mean (13 - 5 = 8 more scores), 45, 46, 47, 48, 49, 50, and 51 should
receive marks of B, and those above score 51, marks of A. The marks
designated are provided in the last column of the above table and differ
widely from those provided by the normal curve method.


SUMMARY AND CONCLUSIONS 


(1) Every precaution should be taken to make the school mark a valid
estimate of the achievement represented.

(2) The common normal curve method of 
distributing marks results in injustices 
except for normal groups. Many school 
groups are not normal groups. Therefore, 
both the mean score or general group 
achievement and the individual scores are 
improperly placed by the normal curve meth- 
od of distribution. 

(3) The standard deviation method of 
distribution of marks provides for the cor- 
rect distribution which the determined per- 
centages in the normal curve method of dis- 
tribution denies. A correction can be ap- 
plied to the mean which will take care of 
the group's difference from that of a nor- 
mal group. Failure is not necessitated by 
this method of distributing marks. Actual 
application of the two methods shows the 
superiority of the standard deviation 
method in that the marks provided are much 
more in agreement with the general achieve- 
ment record. 
THE PRACTICAL STATISTICS OF PREDICTION

by

Charles H. West
Industrial Commission
Madison, Wisconsin


The purpose of this paper is to report a three-year prediction
experiment, making particular reference to the practical statistical
problems which were involved. Of necessity the experiment was of an
exploratory nature. No similar study has been presented in the
literature. The attention of previous writers has been directed to tests
or other predictive measures, and their relative value and how they
might be used. Or, by a study of individual or group case histories,
they point out the psychological values of prediction. But the actual
mechanical processes of computation, prediction, and evaluating
predictions statistically have been somewhat neglected. This report will
be concerned only with the statistical aspects of a practical prediction
situation.

The subjects of the experiment were 731 students enrolled in College
Algebra at the University of Wisconsin, 279 who completed the semester
course in 1932, 219 in 1933, and 233 in 1934. The object was to predict
as well as possible the grades these students would receive. The grades
were given as letters, A, B, C, D, and F, to which, for purposes of
arithmetical computation, were assigned arbitrarily the numerical values
4, 3, 2, 1, and 0, respectively. The question of the accuracy of these
grades, or what they actually measured, is of no consequence to this
experiment.

To predict these grades there were available: 1) the rank in class in
high school, expressed as a percentage; 2) the percentile rank on a
psychological examination; and 3) the numerical mark (range 0 - 100) on
an algebra placement examination. The first two were part of the record
for each student upon entrance, and the placement examination was given,
at the beginning of the term, to those enrolled in College Algebra. To
avoid an excessive burden of computation, the predictive measures were
reduced to one digit, by dropping the unit's place, and recording all
test marks or percentiles in the range from 0 to 9.

The grades for the 1932 group could not, in the natural course of
events, be predicted. The data of the first year formed the basis for
deriving the first prediction formula, to be used in predicting grades
in subsequent years. This is an essential part of prediction work, but
until the formula has been applied, no prediction has been accomplished.
Analysis of the 1932 data showed that the psychological examination
percentile rank was not of sufficient value in addition to the other two
measures to warrant its inclusion in a formula. Seldom in educational
prognosis does the use of more than two predictive measures yield a
satisfactory return for the additional labor involved.

Essential data are summarized in Table I. The formula was derived using
three variables: the grades, referred to by the subscript "1";
placement examination marks, referred to by the subscript "2"; and the
high school percentile rank, referred to by the subscript "3". Although
the arithmetic was simplified wherever possible, the computations were
made with great care. Many steps are involved in the process, several of
them subtractions, and in taking these steps some significant figures
are likely to be lost. For this reason, two variations in the method of
computing a regression coefficient may lead to results which do not
agree exactly. To avoid this difficulty, the writer began with averages
computed to six significant figures, in order to arrive at regression
coefficients correct in three places. Moreover, all computations were
duplicated in order to make the results as accurate as possible.

adopted, and the student was reported as 13th in a class of 50, making
his percentile rank 74, which would have been recorded as 7. These
values, substituted in the formula, give the predicted grade as 2.23. To
simplify the use of the formula for predicting


TABLE I 


SUMMARIZED DATA FOR COMPUTATION 
FORMULAS AND STANDARD ERRORS 


OF PREDICTION 
OF ESTIMATE 








~T 


1932 
Data 


MEANS ceecsecces 


Standard Deviations Oz 


O3 


Tie 
Tis 
Te3 


Intercorrelations .....e. 


Regression Coefficients: 
Bie.s 


a) Standard Form .... 
Bis.2 


Die.s 


Dis.e 
c 


b) Score Form .ccccee 


Number of Subjects ...... N 


Standard Errors of 
Estimate wcccccescees O1,203 


Regression Coefficients of 
Modified Formulas: 
_ 518 
. 210 
-1.12 


a) To Predict 1933 Die.s 
GradeS .eccccoe Dig,e2 
c 


-290 
2252 
-1.15 


b) To Predict 1934 Die.s 
Grades wecceeee Digee 
c 


1932+1933
Combined 
Data 


1933 
Data 


1934
Data 


2.16438 
6.10502 
6.26484 


2.15060 
6.11245 
6.29116 


2.15880 
6.02146 
6.62661 


1.33119 
1.98232 
2.54556 


1.54561 
1.68724 
2.55729 


1.51535 
2.17368 
2.29995 


- 68754 
51852 
-46996 


- 65466 
~53754 
-40242 


- 65028 
52891 
-31924 


309 
-197 
-1.02 


349 
145 
-0.90 





The formula derived gave the predicted grade in terms of the placement
examination mark and the high school rank as

$$X_1 = .446X_2 + .208X_3 - 1.90.$$
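As a rough illustration, the formula and the one-digit reduction of the predictive measures can be written out in Python. The function names are hypothetical; the coefficients are those of the formula above, and the placement mark of 67 and percentile rank of 74 are the values used in the article's own worked example.

```python
def one_digit(value):
    """Reduce a 0-100 mark or percentile to one digit by dropping the unit's place."""
    return value // 10


def predicted_grade(x2, x3):
    """Predicted grade from placement mark x2 and high school rank x3 (both 0-9)."""
    return 0.446 * x2 + 0.208 * x3 - 1.90


x2 = one_digit(67)  # placement mark 67 is recorded as 6
x3 = one_digit(74)  # percentile rank 74 is recorded as 7
print(round(predicted_grade(x2, x3), 2))  # 2.23
```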


Suppose, for example, the placement mark 
were 67, or 6 in the abbreviated form 





1933 grades, a table was made, showing the predicted grade for each of
the hundred possible pairs of values of $X_2$ and $X_3$. Then all
predictions could be quickly read from this table. Such a table would be
cumbersome if scores were used on the whole range of 100 points. In that
case, a nomogram or prediction chart could easily be constructed to save
the labor of computing each value arithmetically. In the present case, a
further simplification was made to lighten the heavy burden of
computation. Predicted grades in the table were rounded to the nearest
.4. Thus there were only 15 different values of the predicted grades,
making easier the construction of scatter diagrams of predicted and
actual grades and the computation of the standard error of estimate.
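A lookup table of this kind might be built as follows. This is a sketch only: the names `round_to_step` and `table` are hypothetical, the coefficients are those of the 1932 formula, and rounding "to the nearest .4" is assumed to mean rounding to the nearest multiple of .4.

```python
def predicted_grade(x2, x3):
    """1932 formula: placement mark x2 and high school rank x3, both 0-9."""
    return 0.446 * x2 + 0.208 * x3 - 1.90


def round_to_step(value, step=0.4):
    """Round to the nearest multiple of `step` (assumed meaning of 'nearest .4')."""
    return round(round(value / step) * step, 1)


# One entry for each of the hundred possible pairs of one-digit values.
table = {
    (x2, x3): round_to_step(predicted_grade(x2, x3))
    for x2 in range(10)
    for x3 in range(10)
}

print(len(table))     # 100 pairs
print(table[(6, 7)])  # the unrounded prediction 2.23 is entered as 2.4
```

Reading a prediction from the table then replaces the arithmetic for each student with a single lookup.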

The standard error of estimate is computed from paired values of
predicted and actual grades as follows: subtract predicted from actual
grades; square these differences; add squared differences; divide this
sum by the number in the group to give the average squared difference.
The square root of this average is the standard error of estimate. The
value computed in connection with the derivation of a formula is never
computed in this way, since no predictions have been made. The condition
which governs the choice of the regression coefficients is that the
standard error of estimate (or more properly, its square) be reduced to
a minimum. The mathematics which accomplishes this will readily give
also the minimum value. One formula is

$$\sigma_{1.23}^2 = \sigma_1^2 \left(1 - \beta_{12.3} r_{12} - \beta_{13.2} r_{13}\right).$$

The value obtained in this fashion for 1932 was .963.
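Both routes to the standard error of estimate can be sketched in Python. The function names are hypothetical, and the grades in the example are invented solely to show the call; they are not data from the experiment.

```python
import math


def see_from_predictions(actual, predicted):
    """Root of the average squared difference between actual and predicted grades."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def see_from_coefficients(sd1, beta12_3, beta13_2, r12, r13):
    """Minimum value from the derivation: sd1 * sqrt(1 - b12.3*r12 - b13.2*r13)."""
    return sd1 * math.sqrt(1 - beta12_3 * r12 - beta13_2 * r13)


# Invented grades, for illustration only.
actual = [4, 3, 2, 2, 1, 0]
predicted = [3.2, 2.8, 2.4, 2.0, 1.6, 0.8]
print(see_from_predictions(actual, predicted))
```

The first function needs actual predictions in hand; the second needs only the statistics of the group from which the formula was derived, which is why it yields what the text goes on to call the "expected" value.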

Why compute this value if it does not refer to any predictions which
have been made? In this, as in much statistical work, the assumption is
that successive groups can be treated as random samples from a
hypothetical "population" or "universe" which possesses constant
characteristics. This condition may not always be met, especially in
educational data, where test editions vary widely, and personal elements
enter. But, working on this assumption, the value .963, obtained from
the 1932 data, is the basis for the only available estimate of the
"true" or "population" value, and may be called the "expected" value.


After 1933 grades were predicted the standard error of estimate was
computed, as suggested above, from the deviations of actual grades from
predicted grades. It is this value which alone deserves the name
"standard error of estimate", as it is the only one which gives an
expression of the accuracy of actual predictions. The value obtained was
.957. As an isolated statistic this value is of practical significance,
indicating the size of the errors which have been made in the
predictions.

Statistically, it is of further interest when compared with the
standard deviation of the 1933 grades, 1.331. The obtained value, .957,
is 71.9 percent of it, so that the predictive efficiency, 100.0 minus
71.9, was 28.1 percent. In other words, the use of the formula was 28.1
percent "better than a guess." "Guessing" merely means predicting
everybody at the average, which is thereby not guessing, but the best
estimate that could be made, knowing only the 1932 grades. In the latter
case, all would have to be predicted at 2.14, the 1932 mean, and 1.331
would be the standard error of estimate. The fact that the 1933 mean was
2.16, or .02 more, would increase the error by less than .0001, though a
marked change in the mean would cause a more noticeable disturbance.

But another comparison could well be made. When the 1933 data were used
to compute a formula, which was thereby the one which would best predict
the 1933 grades, the standard error of estimate was found to be .921.
This smaller figure was due in part to the smaller standard deviation of
1933 grades, but also to a closer relation between predictive measures
and grades. In order to obtain a smaller value than .921, measures
bearing closer relationship to grades would have been required. The
value .921 indicates a predictive efficiency of 30.8 percent. With the
relations as they were, this value may be considered as the limit of
efficiency of prediction.
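The predictive efficiency used in these comparisons is a one-line computation. In the following Python sketch the function name is hypothetical; the figures are those quoted in the text for the 1933 grades.

```python
def predictive_efficiency(standard_error, sd_of_grades):
    """Percent by which the standard error of estimate falls below the
    standard deviation of the grades being predicted."""
    return 100.0 * (1 - standard_error / sd_of_grades)


# 1933 grades: standard deviation 1.331.
print(round(predictive_efficiency(0.957, 1.331), 1))  # 28.1, the 1932 formula as applied
print(round(predictive_efficiency(0.921, 1.331), 1))  # 30.8, the limit for 1933
```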

1. Correction for the number of degrees of freedom used in fitting, which should be used in more exact work, has been
omitted here. The correction would increase the "expected" values by less than .5 percent.

Could not the predictions have been improved by some modification of
the formula which could have been made before the grades being predicted
were known? To predict 1933 grades with 30.8 percent efficiency, 1933
data would have to be used throughout the computation of the formula.
Though complete 1933 data were not available until the grades were
known, the means and standard deviations of the predictive measures were
known, and might be used in place of the corresponding 1932 data in
deriving the formula. In Table I, the regression coefficients are given
in the "standard" form. To reduce them to the "score" form, the formulas
are:

$$b_{12.3} = \frac{\sigma_1}{\sigma_2}\,\beta_{12.3}$$

$$b_{13.2} = \frac{\sigma_1}{\sigma_3}\,\beta_{13.2}$$

$$c = M_1 - b_{12.3} M_2 - b_{13.2} M_3$$

The modification consists merely of using 1933 values of $\sigma_2$,
$\sigma_3$, $M_2$, and $M_3$ instead of those for 1932. For example,
$\beta_{12.3}$ was .4646, and the regular form for $b_{12.3}$ was

$$\frac{1.3567}{1.4131} \times .4646, \text{ or } .446.$$

To obtain the modified form, 1.4131 was replaced by 1.9823, giving for
$b_{12.3}$ the value .318.

This modification might be expected to correct errors which would enter
in case, for example, the new edition of the placement examination were
easier, with resulting higher marks. An unmodified formula would then
predict grades too high, whereas the modification would at least give
the assurance that the mean of the predicted grades would be the same as
the 1932 mean. The assumption is that average grades are more stable
than the averages of the predictive measures. Errors to be corrected,
such as that arising from a change in the difficulty of a test, are of a
constant nature. In small groups where the sampling variability begins
to assume the same proportions as the constant error, the modification
could hardly be expected to be of value. Even in small groups, however,
it may be found by experience that teachers do not vary the average
grades in full proportion to the variations in ability which do occur.
Wherever this is true, undesirable as the situation may be, the
condition required for the use of the proposed modification is
fulfilled.
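The reduction from standard form to score form, and the proposed modification, can be illustrated in Python. The function names are hypothetical; the numerical values are the ones quoted in the example above (the standard-form coefficient .4646, the 1932 standard deviation of grades 1.3567, and the 1932 and 1933 standard deviations of the placement marks, 1.4131 and 1.9823).

```python
def score_coefficient(beta, sd_criterion, sd_predictor):
    """Score-form coefficient b = (sigma_1 / sigma_k) * beta."""
    return sd_criterion / sd_predictor * beta


def constant_term(m1, b12_3, m2, b13_2, m3):
    """Constant c = M1 - b12.3*M2 - b13.2*M3."""
    return m1 - b12_3 * m2 - b13_2 * m3


beta12_3 = 0.4646
sd1_1932 = 1.3567

# Regular form: every statistic comes from the 1932 data.
print(round(score_coefficient(beta12_3, sd1_1932, 1.4131), 3))  # 0.446

# Modified form: only the predictor's standard deviation is replaced
# by its 1933 value, which was known before the 1933 grades were.
print(round(score_coefficient(beta12_3, sd1_1932, 1.9823), 3))  # 0.318
```

The same substitution applies to the means entering `constant_term`, so that the mean predicted grade is held at the 1932 mean whatever the level of the new predictive measures.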
The modified form of the 1932 formula was

$$X_1 = .318X_2 + .210X_3 - 1.15.$$

A table, similar to the one used previously, was made, and the 1933
grades were again predicted. In this case the standard error of estimate
was .947, a figure slightly less than the .957 obtained with the first
formula, and yielding 28.9 percent efficiency, compared with the 28.1
percent obtained previously. It would appear that this modification was
worth while.

To test the innovation further, the experiment was continued to the
1934 grades, which were predicted from the regular and modified formulas
derived from both the 1932 and 1933 sets of data, using tables giving
the predictions to the nearest .4 as before. The standard errors of
estimate, which are summarized in Table II, were as follows:

1932 formula, regular form ..... .936
              modified form .... .916
1933 formula, regular form ..... .908
              modified form .... .913.


Again the modification of the 1932 formula showed an advantage over the
regular form. The slight disadvantage shown in the case of the modified
1933 formula was given further study. An examination of the table showed
that the rounding to the nearest .4 may have been especially unfavorable
in this case. To investigate the point, and at the same time to show how
much error was incurred by the rounding, both 1933 tables were
recomputed to the nearest .1, and the predictions repeated. The standard
errors of estimate were found to be:

1933 formula, regular form ..... .913
              modified form .... .907.

The difference is still small, but is now in favor of the modified
formula. The indication is, moreover, that the error introduced by the
rounding is a relatively small one in comparison to the advantage shown,
for example, by the modified 1932 formula over the regular one.


TABLE II


STANDARD DEVIATIONS OF GRADES COMPARED WITH
STANDARD ERRORS OF ESTIMATE FOR ALL PREDICTIONS:
EXPECTED, OBSERVED, AND MINIMUM VALUES


                                 Standard        Standard Errors of Estimate
                                 Deviation                   Observed
                                 of Grades               Regular   Modified
                                 Being
Prediction                       Predicted    Expected   Formula   Formula   Minimum
1933 grades from 1932 formula      1.331        .963       .957      .947      .921
1934 grades from 1932 formula      1.315        .963       .936      .916      .894
1934 grades from 1933 formula      1.315        .921       .908      .913      .894
1934 grades from 1933 formula
  (nearest .1)                     1.315        .921       .913      .907      .894
1934 grades from combined
  1932 + 1933 formula              1.315        .934       .923      .898      .894


It will be noted that all of the formulas yielded a smaller standard
error of estimate than that obtained in the course of computing them,
i.e., the "expected" values .963 for 1932 and .921 for 1933. But the
complete 1934 data yield a standard error of estimate of only .894, a
figure again smaller than for previous years, but not surprising,
because of the size of the standard deviation of the grades, which was
1.315, also the smallest for three years. The minimum figure, .894,
represents a maximum predictive efficiency of 32.0 percent. The regular
1932 formula yielded 28.8 percent efficiency, while the modification
increased that to 30.3 percent. The various forms of the 1933 formula
gave 30.6 to 31.1 percent efficiency, but there was no way of knowing
before the predictions were made that it would be superior to the 1932
formula.

A further attempt was made to secure a more efficient prediction of the
1934 grades. Data from 1932 and 1933 were pooled, and a single formula
derived, in both the regular and modified forms. Tables were computed as
before, and the 1934 grades were again predicted, with the following
standard errors of estimate:

Combined formula, regular form ... .923
                  modified form .. .898.

Thus the regular form was 29.8 percent efficient, and the modified
form, 31.7 percent, compared with the maximum of 32.0. Unfortunately,
the investigation does not reach far enough to give assurance that the
combination will prove superior in most cases, but such is very likely
to be the case. It certainly cannot be expected that a formula derived
from data of a single group will continue, year after year, to yield
predictions near maximum efficiency, without some modification.

An examination of the data for the three years, as given in Table I,
does not warrant the abandonment of the assumption of random sampling.
Except for $\sigma_2$, none of the means or standard deviations vary
more than would be expected by reason of chance, or "sampling",
variation. The variations in $\sigma_2$ alone are too large to be
explained in this way. Since this variation is in one of the predictive
measures, its effect can be corrected, in part at least, by the
modification which has been used.
CONCLUSIONS


1) Very good predictions were obtained in spite of reducing the
predictive measures to one digit.

2) There was some evidence to show that rounding the predicted grades
so that there would be only 15 or fewer different values did not cause
serious disturbance to the value of the predictions.

3) Variations in most of the data from year to year were not large
enough, for samples of this size, to contradict the conditions of random
sampling.

4) The modification of prediction formulas, by substituting the means
and standard deviations of the predictive measures for the new group
being predicted, was found to give worth-while improvement in predictive
efficiency.

5) Judging from the single case tested, an apparent advantage in favor
of pooling data from successive groups merits at least some
consideration.

PREDICTING THE RETURNS FROM QUESTIONNAIRES:
A STUDY IN THE UTILIZATION OF QUALITATIVE DATA 
by 
Herbert A. Toops 
Ohio State University 


Through the kindness of Dr. William G. Carr of the National Education
Association the writer was supplied with the data regarding the "traits"
(tests) and the percentage return (criterion) on some 135 different
questionnaire investigations which were collected by the N.E.A. in their
effort to formulate principles of questionnaire construction and
promulgation. In the published report¹ certain principles of
questionnaire construction, in order to secure a maximal return from the
questionnaires distributed, were elucidated. These arguments were all
based upon the observed zero-order correlation coefficients or rather
scatter-diagrams between the variables in question (X) and the
percentage return (Y). Accordingly, it readily occurs to one that
perhaps if multiple regression equations were employed some of the
conclusions of that report might be altered considerably. But since many
of the "variables" regarding the questionnaires were qualitative
variables--e.g., form of reproduction of questionnaire: "mimeographed",
"printed", "typewritten",--such variables could not be employed in the
customary multiple regression procedures. The author, being much
interested in determining what were "the most important considerations
in questionnaire formulation in order to get a maximal return" for the
purposes of a forthcoming reference book on questionnaire construction,²
appealed to Dr. Carr for a copy of the original data, as above stated,
in the hope that a method of handling these qualitative variables could
be worked out.

The nature of the data may be inferred from Table I, in which the
"answers" for questionnaires Nos. 012 and 014 are given, together with
the corresponding originally coded scores which were punched into
Hollerith cards for the production of the "validity correlation plots."
In this coding process each different answer was given its own code
number; certain rough classifications, or judgments similarly
categorized, having already been ascribed to certain ones of the
original data by the N.E.A., as for example in variables 1, 5, 10, 11,
12, and 13. Three of the variables were derived by the writer, namely
variables 4, 6, 16, from the data supplied by the N.E.A.


From the resulting plotted validity 
scatter-diagrams it was obvious that, tak- 
ing the scores as originally coded, many of 
the validities were practically zero or 
even negative. There were three obvious 
difficulties:- (1) An alphabetical order 
of the categories of a qualitative variable 
is not necessarily the correct sequential 
order of the categories on a "quantitative 
scale"; (2) some of the quantitative vari- 
ables had evident tendencies toward curvi- 
linearity of regression; (3) the com- 
pounded-answer variables yielded a "valid- 
ity plot" of meaningless complexity. 

Both of the first two difficulties may 
be readily overcome by so transmuting the
scores as to rectify the regression of Y on
X. We can rectify the regression line of a
criterion variable upon a test variable by 
the following simple principle:- To each 
category (or quantitative score) of a vari- 
able ascribe a transmuted score which is 











1. N.E.A. Research Division--The Questionnaire. Research Bulletin of the National Education Association, Vol. 8, No. 1, Jan. 1930, 51 pp.

2. Toops, Herbert A. Questionnaires, Standard Codes and Hollerith Machines. To be published shortly.
TABLE I

THE DATA RECORDED FOR TWO QUESTIONNAIRES, NOS. 012 AND 014

                                                                                              Corresponding Originally
                                                                                              Coded Scores for
                                             "Answers" for Questionnaires No.:-               Correlation-Plotting
Variable Number and Name                     012                    014                       012          014

 1  Source of Questionnaire                  [illegible]            Director of Research      2            4
                                                                    of the N.E.A.
 2  State of Origin (Address of              Connecticut            Washington, D. C.         04           06
    Questionnaire Sender)
 3  Sex of Author                            Male                   Male                      2            2
 4  Is author's name included in             No                     Yes                       1            2
    "American Leaders in Education"?
 5  Subject classification of the            Vocational             Teachers' Salaries and    07           01
    questionnaire                            Education              Professional Status
 6  Length of time in tenths of a year       3                      5                         3            5
    since the beginning of the school
    year, when the questionnaire was
    issued
 7  Form of reproduction of                  Mimeographed           Printed                   1            [illegible]
    questionnaire
 8  Number of pages in the                   [illegible]            [illegible]               [illegible]  [illegible]
    questionnaire
 9  Number of items                          6                      33                        [illegible]  033
10  Types of question asked                  Short answer           Yes and No; check, and    04           07
    (including combination types)                                   Short answer
11  Was space for [illegible]                [illegible]            [illegible]               [illegible]  [illegible]
12  Types of material requested              Statistics; Other      [illegible]               21           04
    (including combination types)            Object. Data, Opinion
13  Availability of Data                     Possible               Easy                      [illegible]  [illegible]
14  Number of copies of                      [illegible]            [illegible]               [illegible]  [illegible]
    questionnaires issued
15  Was a tabulation of results              [illegible]            [illegible]               [illegible]  [illegible]
    furnished the N.E.A.?
16  Average number of items per page         8                      8                         08           08
    of the questionnaire
    (Variable 9 ÷ Variable 8)
17  Number of "usable" replies               [illegible]            [illegible]               [illegible]  [illegible]
    received
    Percentage of replies received           56.2                   55.8                      [illegible]  [illegible]
    (The Criterion)
    [(Variable 17) x 100] ÷ Variable 14

proportional in size to the average dependent (and associated) criterion score of the category (or quantitative score) in question. It will be noted that this principle maximalizes the zero-order validity coefficients; and, since some of the categories have few frequencies, the reliability of this transmutation is low; and, finally, the validities noted are, correspondingly, undoubtedly too optimistic.

Accordingly, in all qualitative variables, and in all quantitative ones where the regressions were markedly curvilinear, the means of the columns of the scatter-diagrams were computed.
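
The principle just stated can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the six questionnaires and their returns are invented for the purpose, and the helper names `pearson` and `category_means` are hypothetical, not part of the original study. Each category receives as its transmuted score the mean criterion score of the questionnaires falling in it, and the resulting zero-order validity is compared with that of an alphabetical coding.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def category_means(categories, criterion):
    """Mean criterion score of each category (the column means of the plot)."""
    totals, counts = defaultdict(float), defaultdict(int)
    for c, y in zip(categories, criterion):
        totals[c] += y
        counts[c] += 1
    return {c: totals[c] / counts[c] for c in totals}


# Hypothetical data, invented for illustration only.
forms = ["Mimeographed", "Printed", "Typed", "Printed", "Typed", "Mimeographed"]
returns = [70.0, 55.0, 76.0, 60.0, 72.0, 67.0]

# Transmuted score of a questionnaire = mean return of its category.
means = category_means(forms, returns)
transmuted = [means[f] for f in forms]

# Alphabetical coding of the categories, for comparison.
alphabetical = {name: i + 1 for i, name in enumerate(sorted(set(forms)))}
alpha_scores = [alphabetical[f] for f in forms]

r_alpha = pearson(alpha_scores, returns)
r_trans = pearson(transmuted, returns)
print(f"alphabetical coding r = {r_alpha:.3f}; transmuted coding r = {r_trans:.3f}")
```

With so few cases per category the transmuted validity is inflated, which is the caution the text raises about low-frequency categories.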


The Differences Technique. (For simple-categoried variables and curvilinear regressions.)

On the qualitative variables, the results were typified by the following:-

Variable 7

Form of Reproduction        Mean Criterion Score
of Questionnaire            (Percents of Returns)
Mimeographed                        68.6
Printed                             57.4
Typed                               74.1

It is clear that in this variable the categories do not appear in the correct sequence of the quantitative variable to which they may be reduced. By rearrangement we have:-

Form of Reproduction        Average Percents
of Questionnaire            of Returns
Printed                             57.4
Mimeographed                        68.6
Typed                               74.1

Now it is clear that, by the above principle, we may rectify perfectly the above regression by ascribing to the three categories X' scores which are in exact proportion to the three averages, namely

                                     X'
Printed                             57.4
Mimeographed                        68.6
Typed                               74.1
However, these numbers are clumsy to deal with, and we have the further principle in coding that: Any score X' can be transformed by the equation X' = a + bX" (provided b be positive) without changing either the magnitude or the sign of the resulting intercorrelation coefficients of the variable X' in question. In other words we can multiply all of the above scores by any positive constant b (either mixed, fractional or decimal) and then add any constant a (either positive or negative) without affecting the correlation coefficients of the variable in question in its relation to other variables. This process can be done in such a way that, by slight shifts in the scores assigned, small integral numbers will result, and the regression will be almost maximalized. In the above case, for example, the work may be arranged as follows:-




















                                      Printed    Mimeographed    Typed
Average Percent of Returns (X')         57.4         68.6         74.1
Difference in Returns                         +11.2         +5.5
Difference Divided by a Common
  Denominator of the Differences
  (5.60)                                      +2.00         +0.98
Coded Scores (X"), if "Printed"
  be Assigned a Score of 1                1            3            4

In the above, a = +51.8
              b = 5.6
and,         X' = 51.8 + 5.6X"
although, by the method (of differences) used, there is no need for the determination of the magnitudes of a and b¹ or for the statement of the transmutation equation.
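
The arithmetic of the method of differences can be sketched as follows. The function name `difference_codes` and its parameters are hypothetical, introduced only for illustration; the three averages and the common denominator 5.60 are those of the worked example above.

```python
def difference_codes(means, unit, first_code=1):
    """Assign small integral codes to categories by the method of differences.

    means: mapping of category -> average percent of returns (X')
    unit:  the common denominator by which successive differences are divided
    """
    ordered = sorted(means.items(), key=lambda item: item[1])
    code = first_code
    codes = {ordered[0][0]: code}
    for (_, previous), (name, current) in zip(ordered, ordered[1:]):
        # Each difference, in units of the common denominator, is shifted
        # slightly to the nearest whole number.
        code += round((current - previous) / unit)
        codes[name] = code
    return codes


means = {"Mimeographed": 68.6, "Printed": 57.4, "Typed": 74.1}
codes = difference_codes(means, unit=5.6)
print(codes)

# The constants of X' = a + bX" implied by the coding (not needed in practice).
b = 5.6
a = means["Printed"] - b * codes["Printed"]
print(f"a = {a:.1f}, b = {b}")
```

The final lines recover a and b only to show agreement with the figures quoted in the text; the coding itself never uses them.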


The Transmutation Table Technique. (An al- 
ternative method for simple-categoried 
variables and curvilinear regressions.) 

In the case of variable 2, the state
of origin (address of questionnaire sender),
we did the following:-

1. Upon an outline map of the United 
States we copied the several percentage re- 
turns individually of all the questionnaires 
originating in each of the respective states. 
Seventeen of the states had none; ten others 











1. It will be observed that where there are only two categories to a variable (e.g., "No" and "Yes") these may always be coded 1 and 2 without further ado.
had 1, while twenty-one had from 2 to 21, the last originating in the state of New York; Canada and the District of Columbia also had one each.

2. The returns by states were averaged; whereupon it was noted that there are distinct regional tendencies, for example, very low returns in the extreme west and very high in the northcentral and northeast. The average return figure thus obtained for a state was allowed to represent a state in all cases where there were two or more questionnaires per state; also in all cases of one per state unless that figure was more than 15 percent removed from the modal figure of that region, when the modal figure of the region was substituted instead, such correction being necessary only for two states.¹

3. The resultant percentages were coded by transmutation as shown in Table II.


TABLE II

Average Percentage
of Return Per State          X"

     52 - 53                  1
     54 - 55                  2
     56 - 57                  3
     58 - 59                  4
     60 - 61                  5
     62 - 63                  6
     64 - 65                  7
     66 - 67                  8
     68 - 69                  9
     70 - 71                 10
     72 - 73                 11
     74 - 75                 12
     76 - 77                 13
     78 - 79                 14
     80 - 81                 15
     82 - 83                 16
     84 - 85                 17


The final coding is shown in Table III. It 
will be noted that some 17 states are not 
represented at all in the list; and, for 
these states scores will need to be supplied 
arbitrarily, for the present, in using our 
rating scale proposed below in areas not 
covered by the original data of this study. 
TABLE III

A RELATIVE INDEX OF PROPORTIONATE RETURNS TO
QUESTIONNAIRES TO BE EXPECTED, AS BASED
UPON THE STATE OF ORIGIN OF THE SENDER

                        X"
Alabama                 17
California               2
Colorado                 8
Connecticut              9
Delaware                15
Dist. of Columbia        1
Georgia                 17
Illinois                 5
Indiana                  1
Iowa                     8
Kansas                  11
Kentucky                 8
Maryland                10
Massachusetts            9
Michigan                14
Minnesota                7
Missouri                 8
Montana                  5
Nebraska                 7
New Hampshire            9
New Jersey              15
New York                 8
North Carolina           3
Ohio                     9
Oklahoma                 8
Oregon                   2
Pennsylvania             7
Tennessee               16
Texas                    7
Virginia                 7
Washington               2
Wisconsin               14
Canada                  10





The above process is an alternative to the former and, like the former, makes a determination of the magnitudes of a and b unnecessary.
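
The transmutation of Table II, together with the substitution rule of step 2, can be sketched as below. The function names are hypothetical. Two readings are assumed rather than stated in the text: that "15 percent removed" means fifteen percentage points, and that a fractional average falls in the interval whose lower bound it has reached.

```python
def transmute(percent, low=52, width=2, high=86):
    """Coded score X" for an average percentage return, following Table II."""
    if not low <= percent < high:
        raise ValueError("percentage lies outside the range of Table II")
    return int((percent - low) // width) + 1


def state_figure(returns, regional_mode, tolerance=15.0):
    """Figure allowed to represent a state before transmutation.

    returns:       percentage returns of the questionnaires from the state
    regional_mode: modal figure of the region in which the state lies
    tolerance:     assumed here to be in percentage points
    """
    average = sum(returns) / len(returns)
    # A single questionnaire far from the regional mode is distrusted,
    # and the regional mode is substituted for it.
    if len(returns) == 1 and abs(average - regional_mode) > tolerance:
        return regional_mode
    return average


print(transmute(52), transmute(69), transmute(85))
print(state_figure([40.0], regional_mode=60.0))
print(state_figure([40.0, 50.0], regional_mode=60.0))
```

A state with no questionnaires has no figure at all under this sketch; the text supplies such scores from the map, by region.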


Treatment of "Answer Patterns." 

In the case of variables 12 and 10, 
where the "answers" to the variables are 
compounds or patterns, each pattern was in- 
dividually coded but without yielding mean- 
ingful results when plotted as validity ta- 
bles. 

The original categories of "answer" to 
Variable 12, for example, were:- 








1. It will be noted that this is done in accord with the notion above that low-frequency-determined averages are less reliable than high-frequency-determined averages. The map procedure also gives a method of supplying scores for the seventeen states which had no questionnaire originating therein; and, as such, is applicable to locations only.
Addend    Category
   1      Statistics
   2      Lists of names (schools, etc.)
   4      Other objective data
   8      Discussion of subjective opinion
  16      Opinion

If now one will add the addends for any pattern of "response" one will arrive at a unique code number for that pattern. For the answer-pattern "statistics, other objective data, and opinion" for example, the code number is 21. The resulting validity table [of 31 code numbers (X) against percent return (Y)] was, as might have been expected, meaningless. Clearly we have here a case of plotting in sixfold space the possible combinations of 5 variables of two categories each--possession and non-possession--and noting the resulting surface where variable 0 is the dependent variable! Various expedients were tried. Finally, it became clear that the most important distinction was between "objective" and "subjective" replies and that the "complexity" of reply demanded had something to do with the matter. These distinctions could be made by dividing the above categories into objective and subjective types as follows:-

Objective types (3 types)
    No objective data
    Statistics
    List of names (schools, etc.)
    Other objective types
Subjective types (2 types)
    No subjective data
    Discussion of subjective opinion
    Opinion

Accordingly, a questionnaire might have anywhere from 0 to 3 objective types of data requested, and, concurrently, anywhere from 0 to 2 subjective types. Table IV was constructed.

TABLE IV

PERCENTAGE OF RETURNS ON VARIABLE 12

(The number in parentheses in the upper left-hand corner of a compartment is the number of questionnaires involved; the figure in the body is the average percentage return; and the figure in the triangle is the code number finally decided upon.)

Number of                    Number of Subjective Types
Objective Types        0                   1                   2

      0           --                  65.5, code 10       [illegible]
      1           (42) [illegible]    (21) [illegible]    (10) [illegible]
      2           (25) 67.5           (10) 64.5           (6) 58.0
      3           (4) 58.2            (2) 22.0, code 4    (2) [illegible]

If one will erect pins at the centre of the compartments of Table IV, proportional in height to the table entries therein he will note that a fairly smooth doubly warped surface results, if we readjust but one point, compartment 3-1 (Row 3; column 1). The resulting code numbers (figures in the triangles of Table IV) were determined then from the following transmutation table (a 4 being supplied arbitrarily for the percentage 22.0):-

     X'              X"
46.0 - 47.9           1
48.0 - 49.9           2
50.0 - 51.9           3
52.0 - 53.9           4
     .                .
     .                .
     .                .
74.0 - 75.9          15

The result, it will be seen from the table, is a series of very satisfactory code numbers most of which are based on enough cases perhaps to be fairly reliable. In such tables the reliability of the coding inheres somewhat in the consistency of the network of values assigned.

In variable 10, after trying successively various combinations, the eventual distinction seemed to turn first upon simply recorded answers (yes-and-no; and checking) as versus more complicated written answers (either short or long written answers, requiring more deliberation in determining what to write down as the answer); and second, upon the variety of answering
procedures demanded, complicated answers receiving the better returns!¹ Table V was evolved.

TABLE V

PERCENTAGE OF RETURNS ON VARIABLE 10

(The number in parentheses in the upper left-hand corner of a compartment is the number of questionnaires involved; the figure in the body is the average percentage return; and the figure in the triangle is the code number finally decided upon.)

                                   Complicated written answer
Simple Answer            No complicated       Complicated answers (short and/
Required                 answers              or long written answers)

No simple answers        --                   (35) 67
Yes-and-No               (2) 58               (64) 72
Check                    (4) 63               (6) 71
Yes-and-No and Check     (5) [illegible]      (21) [illegible]
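
The addend coding of Variable 12 described above, and its later collapse into counts of objective and subjective types, can be sketched as follows. The dictionary and function names are hypothetical; the addends themselves are those listed in the text.

```python
ADDENDS = {
    "statistics": 1,
    "lists of names": 2,
    "other objective data": 4,
    "discussion of subjective opinion": 8,
    "opinion": 16,
}
OBJECTIVE = {"statistics", "lists of names", "other objective data"}


def pattern_code(types):
    """Unique code number of an answer pattern: the sum of its addends."""
    return sum(ADDENDS[t] for t in types)


def decode(code):
    """Recover the answer pattern from its code number."""
    return {t for t, addend in ADDENDS.items() if code & addend}


def type_counts(types):
    """(number of objective types, number of subjective types) requested."""
    n_objective = sum(1 for t in types if t in OBJECTIVE)
    return n_objective, len(types) - n_objective


pattern = {"statistics", "other objective data", "opinion"}
print(pattern_code(pattern))   # the pattern cited in the text
print(type_counts(pattern))    # row and column of its compartment in Table IV

# Every non-empty pattern of the five types has its own code number.
n_patterns = len({code for code in range(1, 2 ** len(ADDENDS))})
print(n_patterns)
```

Because each addend is a distinct power of two, no two patterns can share a sum, which is why the 31 possible patterns give 31 code numbers.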
All the variables having thus been coded and quantified, all intercorrelations, means and standard deviations were computed (Table VI, page 210). To these the multiple ratio technique was applied in order to determine, successively, the identity of the minimal 2, 3, 4 ... variable-composites which, optimally weighted, will maximally predict the return to be expected of a questionnaire of a given pattern of (our test-measured) characteristics. See Table VII, page 211.

The variable of highest zero-order validity is Variable 5, the subject-classification of the questionnaire--whether the questionnaire deals with a topic which "in general is well replied to" by the persons questionnaired; and a measure of the recipients' "interest" in the investigation--and this variable in its own right correlated .4328 with the returns received. Employing one "test" only to predict the percentage return, we would use this one.

If now we give standard scores in this variable a weight of 1.0000, that one of all the remaining sixteen variables which will raise the prediction most is Variable 14, the number of copies of the questionnaire issued--a variable of almost equal importance as revealed by its negative weight of -.9796--and results in a multiple coefficient of .5609 for the two tests in combination, so weighted. As stated in the remarks, this variable probably takes second place in the regression equation for the reason that it is, perhaps, the best indirect (and inverse) measure we have of the amount of follow-up effort expended on the questionnaires. Observation would indicate that very large mailing lists are frequently if not commonly employed for the purpose of getting a "large enough reply" without resorting to follow-up procedures. In a recent business questionnaire, for example, over 200,000 questionnaires were distributed in order to obtain some 18 percent, or about 40,000 replies. Accordingly, this variable received a negative weight and boosts the multiple ratio coefficient markedly. No subsequent variable can add nearly so much. If we were to predict the returns by two variables only we should employ these two, Variables 5 and 14, for the purpose. The regression equation,

$$\frac{\bar{X}_0 - M_0}{\sigma_0} = (1.00)\,\frac{X_5 - M_5}{\sigma_5} - .9796\,\frac{X_{14} - M_{14}}{\sigma_{14}},$$

has a validity of .5609 in predicting the expected returns, $\bar{X}_0$, to questionnaires for which these two items are known in advance.
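
The right-hand side of the two-variable equation can be evaluated as in the sketch below. The weights 1.00 and -.9796 are those of the text; the means and standard deviations are hypothetical placeholders, since Table VI is not reproduced in this section, and the function name is likewise an assumption.

```python
# Weights in standard-score form, as given in the text for Variables 5 and 14.
WEIGHTS = {5: 1.00, 14: -0.9796}

# Hypothetical means and standard deviations, for illustration only.
MEANS = {5: 9.0, 14: 400.0}
SDS = {5: 2.0, 14: 300.0}


def standard_score_composite(scores, means=MEANS, sds=SDS, weights=WEIGHTS):
    """Weighted sum of the standard scores of the predictor variables."""
    return sum(
        weight * (scores[var] - means[var]) / sds[var]
        for var, weight in weights.items()
    )


# A questionnaire at the mean on both variables has a composite of zero.
at_mean = standard_score_composite({5: 9.0, 14: 400.0})

# One standard deviation more copies issued lowers the composite by .9796.
large_mailing = standard_score_composite({5: 9.0, 14: 700.0})
print(at_mean, large_mailing)
```

The sign of the second weight carries the argument of the paragraph: a larger mailing list, other things equal, lowers the expected percentage return.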
Variable 2 comes next, the state of origin of the questionnaire, which, with a weight of .8861, yields in combination with the previous two--a three-variable scale--a multiple ratio coefficient of .6245. [After the second test, the magnitude of the β-weight no longer is an index of the











1. It will be remembered that we are not here by this diagram answering either the question "Are short or long questions desirable in a questionnaire?" or "Are simple or complicated forms of answer response desirable in a questionnaire?" With the variable quantified, we may hope that the regression analysis will throw some light on these problems.
relative worth of the variables, but worth must be inferred only from the increase in size of the multiple ratio coefficient. It will be remembered that all other variables yield lower coefficients when placed in the third place in our scale.] Accordingly, employing only Variables 5, 14, and 2 we can predict, by means of the proper equation, the returns on a questionnaire in education with considerable efficiency as indicated by the coefficient .6245. This last-added variable is highly important no doubt for the reason that the attitude of recipients towards answering questionnaires is a product of geographical location, as well as, no doubt, of excellence of technique (including follow-up) employed in particular localities. The appropriate regression equation to use is

$$\frac{\bar{X}_0 - M_0}{\sigma_0} = (1.00)\,\frac{X_5 - M_5}{\sigma_5} - .9796\,\frac{X_{14} - M_{14}}{\sigma_{14}} + .8861\,\frac{X_2 - M_2}{\sigma_2}$$



Variable 13, availability of data, comes next. As indicated by the footnote of Table VII, the negative weight here is a product of the method of coding. Clearly the result indicates that the questionnaire returns are dependent upon "the path of least effort" so far as the recipients are concerned. One can hardly hope to get unpaid "researchers" to do research for one.


The next most important variable is Variable 10, the types of question asked. The β-weight is positive. This means that any combination with high X"-value receives better returns than any combination with low X"-values. Contrary to expectations the more complicated forms of response when employed in a questionnaire receive the higher return. Other things (specifically "Reasonableness") having just previously been accounted for (Variable 13), the positive weight possibly means that it is the "thinking" type of question which is "of interest" and "of value" to the recipient--or at least of such minimal value as to lead him to reply rather than to refrain from replying. It will also be noticed (Table V) that checking alone (and also as a score in the regression equation) yields a better return than the yes-no type of question. It is the writer's observation that many persons attempt to force the replies of "yes" and "no" to issues to which one may not, honestly, answer a categorical "yes" or "no."

The sixth variable (No. 12) is Types of Material Requested. This is in part a plea for objectivity, on the part of recipients, as versus subjectivity and opinion; but, even more so for simplicity of mode of answer response. If one must change frequently his mode of response, particularly if he has many alternations from fact-to-opinion-to-fact-to-opinion, we might guess that the recipient of a questionnaire will become disgusted and throw the whole questionnaire in the waste basket. Such variables as Nos. 10 and 12, where patterns of response are the issue, are of much greater difficulty to interpret than the simple responses of a simple-categoried variable such as Variable 7, the form of reproduction [where it is clear, as shown in the footnote below, that typed questionnaires get the greater response, the other things (of this investigation) being equal, because--and this is speculation--the typed questionnaires are sent to one's more intimate, more-professionally-inclined friends who have a greater personal obligation to reply.] One can give statistics, names of people, schools, etc., quickly and without thinking--if available at one's fingers' ends. It takes time to make up one's mind on issues; some composing ability "to write it well", and some mental inertia to be overcome to get the thinking process started. Besides, one's opinions are sometimes valued for personal exploitation--for a proposed book or article, or for impressing a superior at the proper time (of annual promotions, for example) with one's worth--and, accordingly, are not always to be had for the mere asking, since all too often it is thought that "to give away an idea impoverishes me and is of doubtful value to you, since you do not know the conditions out of which it originated."

The next variable is the source of the 
questionnaire (No. 1) whether sent by a 
school superintendent, a college professor, 
a research bureau, etc. County departments 
of education get the best returns [we sus- 
pect some “moral” obligation to reply] 
while business firms (including publishers), 
associations and foundations of national 
scope and universities, colleges and normal 
schools, receive the least. It is easy to 
see that compulsion, moral duty to reply, 
intimate knowledge of and friendship with 
the sender are elements in the situation 
here. 

In view of the fact that these seven 
variables yield a multiple ratio coefficient 
of .7137 as the efficiency with which the 
percentage returns can be predicted by means 
of these seven most important variables, and 
in view of the fact that there is every in- 
dication that the remaining ten variables 
in all would not raise this figure to .73, 
we may consider this selection as adequate 
for the purpose of building a scale to predict questionnaire returns.¹

If, in summary, we would seek to abbreviate to a few phrases the seemingly important elements in securing high returns--going beyond our data somewhat in our attempt--we would be inclined to guess that the following, in approximately a descending order of importance, are the elements in getting a high return:

1. Write the questionnaire on a topic 

















1. The next five variables in order to enter the composite, together with their multiple ratio coefficients and β-weights are:-

                                                                  Multiple ratio
Variable                                                          Coefficients     β-weight
6  Length of time in tenths of a year since the beginning
   of the school year that the questionnaire was issued.*            .7199          .5360*
9  Number of items (questions) requested                             .7226         -.2708
4  Is author's name included in American Leaders in Education        .7251          .2331
7  Form of reproduction of questionnaire (printed = 1;
   mimeographed = 3; typed = 4)                                      .7257          .1184
8  Number of pages in the questionnaire                              .7261         -.0948

*The positive weight of the coded scores, so taken as to rectify the regression and give large coded scores to the low tenths, is the equivalent of a negative weight, meaning "the seven previous variables being equal, the earlier in the school year the questionnaires are sent the better." The coding table is not reproduced for lack of space.
in which the recipients are vitally interested themselves in knowing the answer, and take pains to exploit this interest to the utmost. (Variable 5.)

2. Send the questionnaire to those few 
people who, because of personal friendship 
and knowledge of your professional repute, 
will feel some personal obligation to re- 
ply. Exploit this by promises of the re- 
sults, and other means. 

3. Employ a vigorous follow-up technique, devised to touch upon various motives in turn, as viewed from the recipients' angle. Do not be content to send 40,000 questionnaires and receive a reply of as many hundreds. (Variable 14.)

4. Use the best possible technique in writing your questions.

5. Circulate your questionnaire in those portions of the country where "replying is more than a courtesy, and approaches a fixed habit." (Variable 2.)

6. Don't tax the interest and effort of 
the recipient, but make it easy for him to 
reply. (Variable 13.) Remember that he is 
in your employ only by courtesy. 

7. Use objective, unequivocal but "sensible" questions. (Variables 10 and 12.) Do not avoid written answers; but be chary of "essay" answers.

8. Employ advisedly such incidental 
pressures as “moral obligation to reply." 
(Variable 1.) 

9. Send your questionnaire early in the 
school year, before the pressure of other 
duties decreases its chances of receiving 


attention. (Variable 6.) 

10. Don't worry about the length of the 
questionnaire, (Variables 9 and 8) if all of 
the other rules have been followed faithful- 
ly, but remember that length may be a symp- 
tom of slovenly questionnaire technique and 
proceed accordingly. 

An examination of the questions con- 
tained in, or implied in, the analysis of 
the list of seventeen variables obtained on 
these questionnaires reveals the fact that 
perhaps some of the more important variables 
were not obtained for our analysis. In the 
course of the study the conviction has grown that some of the following are, perhaps, more important than many of the "questions asked" of questionnaire investigations in this study:

1. The number of "follow-ups" employed. 

2. Was the promise-to-reply technique used?¹

3. Was the relationship of recipients to sender one of "moral duty to reply", e.g., such as of any school employee of a state in relation to the state director of research in education of the same state?

4. Was the questionnaire anonymous?

5. Was the investigation "confidential"?

6. An index of the degree of familiar- 
ity, intimacy, friendliness and profession- 
al interest of the recipients in general 
with respect to the sender. 

7. Size of page used; perhaps area of 
the page used. 

8. Crowding of questions index; per- 
haps average number of questions per page, 
measuring pages in fractions rather than 
integers only, as herein employed. 

9. Were recipients promised "pay" for 
replying; cash, mention in report, extra 
copy of questionnaire, copy of the report, 
or combinations of the above? 

10. Was humor used? 

11. Were illustrations used?

12. An index of the average tendency to 
reply of the recipients to the question- 
naires received based upon the state in- 
dices-of-reply-tendency of the several 
states in which the recipients reside. 

13. An index (rating scale) of the ex- 
cellence of the technique employed in the 
questions used. 

At least three of the above variables, Nos. 6, 12 and 13, are indices requiring some research to devise the index, in question, to be used. With such available it is inevitable that the multiple ratio coefficient of .73 obtained in this investigation could be raised considerably. This in turn means that the poor questionnaire can be spotted ahead of its distribution; the





1. One questionnaire of 1835 questions, of the original 136 in the N.E.A. investigation, was discarded because this technique was employed and since the questionnaire did not seem to belong to "the universe" from which the remaining 135 were sampled.
points on which it scores low (see the rating scale below) may be remedied by appropriate techniques, in some cases; and the
proposed conditions of distribution may be 
altered with a view to securing a maximal 
return--in other words the whole process 
can be as well or better controlled than 
any known kind of campaign involving the 
cooperation of others for its success. 

The results of this study have been 
compiled in the form of a rating scale as 
shown below. This may be used as a means 
of predicting the percent of questionnaires 
that will be returned in a given case. 





Question                                                               Credit    Copy Credit
Number                                                                 Points    Here

   0    Constant, K, depending on the combination of tests used:
          when Questions 1 and 2 only are employed                       2.70
          1, 2 and 3 only                "       "                     -34.06
          1, 2, 3 and 4                  "       "                     -14.49
          1, 2, 3, 4 and 5               "       "                     -42.52
          1, 2, 3, 4, 5 and 6            "       "                     -82.86
          1, 2, 3, 4, 5, 6 and 7         "       "                     -98.56

   1    (Variable 5) Subject Classification of the questionnaire
        β-weight = 1.0000, W-weight = 8.83007

        Classification of Questionnaire
        11*  01*  Teachers' Salaries and Professional Status             97.13
         8   02   Administration                                         70.64
        10   03   Health and Physical Education                          88.30
         1   04   Instruction and Organization and Spec. Method           8.83
         6   05   Reading                                                52.98
         9   06   Finance                                                79.47
         5   07   Vocational Education                                   44.15
         8   08   Junior High Schools                                    70.64
        12   09   Special Education                                     105.96
         7   10   Buildings                                              61.81
         8   11   Textbooks and Supplies                                 70.64
         9   12   Geography                                              79.47
         9   13   Student Accounting                                     79.47
         8   14   Mathematics                                            70.64
         8   15   Secondary Schools                                      70.64
         9   16   Music                                                  79.47
         9   17   Elementary Schools                                     79.47
         8   18   Kindergarten and Primary                               70.64
        10   19   Libraries                                              88.30
         7   20   Supervision                                            61.81
         8   21   Counseling                                             70.64
         9   22   Adult Education                                        79.47
        11   23   Rural Education                                        97.13
        11   24   Teacher Training                                       97.13
         9   25   Thrift Education                                       79.47
         9   26   Theory and Principles of Education                     79.47
        13   27   Foreign Languages                                     114.79
         9   28   Home Economics                                         79.47
        12   29   Statistics                                            105.96
         9   30   Religious Education                                    79.47
         9   31   English                                                79.47
         9   32   Curriculum                                             79.47
        11   33   Extra-curricular Activities                            97.13
         9   34   Tests and Measurements                                 79.47
         8   35   Commercial Education                                   70.64
         9   36   Handwriting                                            79.47
        10   37   Character Education                                    88.30


   2    (Variable 14) Number of copies of questionnaire issued
        β-weight = -.9796; W-weight = -.0399038
        Rule: Multiply the number of copies distributed by the multiplier,
        -.0399038, and record product in credit column (minus sign attached).

   3    (Variable 2) State of origin of questionnaire (determined by
        address of sender)
        β-weight = .8861; W-weight = 4.25473
        Only those states are given which had one or more questionnaires
        in this investigation.

        17   01   Alabama                                                72.33
         2   02   California                                              8.51
         8   03   Colorado                                               34.04
         9   04   Connecticut                                            38.29
        15   05   Delaware                                               63.82
         1   06   Dist. of Columbia                                       4.25
        17   07   Georgia                                                72.33
         5   08   Illinois                                               21.27
         1   09   Indiana                                                 4.25
         8   10   Iowa                                                   34.04
        11   11   Kansas                                                 46.80
         8   12   Kentucky                                               34.04
        10   13   Maryland                                               42.55
         9   14   Massachusetts                                          38.29
        14   15   Michigan                                               59.57
         7   16   Minnesota                                              29.78
         8   17   Missouri                                               34.04
         5   18   Montana                                                21.27
         7   19   Nebraska                                               29.78
         9   20   New Hampshire                                          38.29
        15   21   New Jersey                                             63.82
         8   22   New York                                               34.04
         3   23   North Carolina                                         12.76
         9   24   Ohio                                                   38.29
         8   25   Oklahoma                                               34.04
         2   26   Oregon                                                  8.51
* The rightmost of the two starred figures throughout is the original coded score, while the leftmost is the final quantified (and rectified) score, which multiplied by the gross score or W-weight yields the credit for the answer at issue as recorded in the "Credit Points" column.
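
The scoring rule in this footnote can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the figure of 500 copies is an invented input, and the two weights are the values printed on the sheet for Variable 2 and in the rule for Variable 14.

```python
# Sketch of the scoring rule: credit = quantified score x W-weight.
W_STATE = 4.25473        # W printed for Variable 2 (state of origin)
W_COPIES = -0.0399038    # multiplier printed in the rule for Variable 14


def credit(score, weight):
    """Credit points for one answer, rounded to two decimals as on the sheet."""
    return round(score * weight, 2)


def predicted_return(credits):
    """Algebraic total of the credit points (the predicted percentage return)."""
    return round(sum(credits), 2)


state_credit = credit(7, W_STATE)        # a state with quantified score 7
copies_credit = credit(500, W_COPIES)    # hypothetical: 500 copies issued
print(state_credit, copies_credit, predicted_return([state_credit, copies_credit]))
```

A quantified score of 7 with the Variable 2 weight gives 29.78, which agrees with a figure appearing in the Credit Points column of the sheet.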
Question Number                      Credit Points

                                     29.78
                                     68.08
                                     29.78

        Pennsylvania
        Tennessee
        Texas
        Virginia
        Washington
 4  32  Wisconsin
(Variable 11) Availability … requested.
-.6407.

1  Easy
2  Possible
3  Hard
4  Unreasonable

… of Question Asked
= 2.58169.
Yes-and-no
Check
Yes-and-no and check
Short or long written answers
Yes-and-no and short or long written answers
Check and short or long written answers
Yes-and-no, check, and short or long written answers


(Variable 12) Types of material requested. β = .5806. W = 5.49451.

1. Count the number, from 0 to 3, of different objective types of material requested:
   0. No objective data
   1. Statistics
   2. Lists of names, schools, etc.
   3. Other objective data.
2. Count the number, from 0 to 2, of different subjective types … requested:
   0. No subjective data
   1. Discussion of … opinion
   2. Opinion … subjective

   Score according to following table:-
   [Table of scores by number of objective types and number of subjective types; entries not recoverable.]

(Variable 1) Source of the questionnaire. W = 3.22812.

24843.

1  A City Supt. of Schools
2  A City school employee writing in official capacity (including city teachers organizations)
   A university, college, or normal school
   An association or foundation of national scope
   A non-collegiate private school
   A U.S. Government Agency
   A state governmental agency
   A county department of education
   A business firm (including publishers)
   A private individual, no official status given
Algebraic Total = (Predicted Percentage Return)
THE REVERSIBILITY OF PROOF

by
W. Line
and
H. B. Hedman
University of Toronto

I. INTRODUCTION

In a previous paper, (2) certain of the basic theorems underlying Spearman's factor theory were presented in a manner that can be appreciated by those unacquainted with the higher mathematics. Among them was included the proof that when each of four variables can be divided into two factors, the one being common to all four variables, whilst the other is in each case specific and independent, the tetrad equation holds. We shall here take up the reverse problem, namely, as to whether, when the tetrad criterion is satisfied, then every variable would necessarily be divisible into the said two factors. The original solutions were achieved by Garnett, (1) and by Spearman, (3). The discussion which follows constitutes a simplification of section I, 3 of the Appendix,¹ (The Abilities of Man, (4)) and involves theorems presented by Spearman in 1913 (5) and 1922 (3). These theorems are also restated as they are employed in the present paper.

II. THE REVERSIBILITY OF PROOF

The tetrad equation may be represented as follows:

$$r_{xy} \cdot r_{wz} = r_{xw} \cdot r_{yz} = r_{xz} \cdot r_{yw} \qquad (1)$$

Our task is to show that, when this equation holds, each variable is necessarily divisible into two factors, one of which is common to all four variables, the other being specific. Reference to Fig. 1 will make the setting of equation 1 quite clear.

Fig. 1

        x       y       w       z       q       p
x       -      r_xy    r_xw    r_xz    r_xq    r_xp
y      r_xy     -      r_yw    r_yz    r_yq    r_yp
w      r_xw    r_yw     -      r_wz    r_wq    r_wp
z      r_xz    r_yz    r_wz     -      r_zq    r_zp
q      r_xq    r_yq    r_wq    r_zq     -      r_qp
p      r_xp    r_yp    r_wp    r_zp    r_qp     -

From equation 1 we get

$$r_{xy} = \frac{r_{xw}}{r_{wz}} \cdot r_{yz}.$$

Similarly

$$r_{xq} = \frac{r_{xw}}{r_{wz}} \cdot r_{zq} \quad \text{and} \quad r_{xp} = \frac{r_{xw}}{r_{wz}} \cdot r_{zp},$$

where q, p, are other variables in an equi-proportional table of correlations, (i.e., a table in which the tetrad criterion is satisfied throughout).

Therefore $r_{xy} = A_{xz} \cdot r_{yz}$, where $A_{xz}$ is a constant, whilst y takes all values except x or z. (Equation 6, Appendix, p. iii.)

Our question, then, is tantamount to asking whether, on assuming equation 6, each variable can be reduced to the form:

$$a = f_a g + d_a \qquad (7) \quad \text{(Appendix, p. iii.)}$$

1. Appendix refers to the Appendix in The Abilities of Man, (4) except where otherwise indicated.
where 1. $f_a$, $f_b$, etc., are constant for all particular values of a, b, etc.,
      2. g is an element common to all the variables,
      3. $d_a$, $d_b$, etc., are uncorrelated with g,
      4. $d_a$, $d_b$, etc., are uncorrelated with each other.

These four "conditions" must be shown to be satisfied, if the "reversibility" theorem is to hold.

With regard to conditions 1 and 2, any of the variables may be written in the form given in equation 7, so as to satisfy these conditions. Indeed, we may give to $f_a$ and to g any values we please, so long as

$$d_a = a - f_a g.$$

The third condition demands that

$$0 = r_{d_a (f_a g)} = r_{(a - f_a g)(f_a g)},$$

where, in deviation form,

$$r_{(a - f_a g)(f_a g)} = \frac{\Sigma (a - f_a g)(f_a g)}{\left[\Sigma (a - f_a g)^2 \, \Sigma (f_a g)^2\right]^{1/2}} = \frac{f_a \Sigma ag - f_a^2 \Sigma g^2}{\left[\Sigma (a - f_a g)^2 \, \Sigma (f_a g)^2\right]^{1/2}}.$$

Now $d_a = a - f_a g$, and $\sigma_{d_a}^2 = \Sigma d_a^2 / N$. Hence $\Sigma (a - f_a g)^2 = \Sigma d_a^2 = N \sigma_{d_a}^2$.

The third condition consequently demands that

$$0 = \frac{f_a \Sigma ag - f_a^2 \Sigma g^2}{\sqrt{f_a^2 \Sigma g^2} \, \sqrt{N \sigma_{d_a}^2}}.$$

But, by choosing the units so that $\sigma_a = \sigma_b = \dots = \sigma_g = 1$, we have $\Sigma ag = N r_{ag}$ and $\Sigma g^2 = N$, and the third condition demands that

$$0 = f_a r_{ag} - f_a^2,$$

or that $f_a = r_{ag}$, i.e., that $a = r_{ag}\, g + d_a$.

The first three conditions can accordingly be satisfied for any set of variables whatever.

The fourth condition,--namely, that $d_a$, $d_b$, etc., are uncorrelated with each other,--demands that

$$0 = r_{d_a d_b}.$$

Or, since $a = f_a g + d_a$, $b = f_b g + d_b$,

$$0 = r_{(a - f_a g)(b - f_b g)} = \frac{\Sigma (a - f_a g)(b - f_b g)}{\left[\Sigma (a - f_a g)^2 \, \Sigma (b - f_b g)^2\right]^{1/2}} \quad \text{(deviation form)}$$

$$= \frac{\Sigma ab - f_a \Sigma bg - f_b \Sigma ag + f_a f_b \Sigma g^2}{\left[(\Sigma d_a^2)(\Sigma d_b^2)\right]^{1/2}}.$$

If we choose our units so that $\sigma_a = \sigma_b = \dots = \sigma_g = 1$, we may write this fourth condition thus:

$$0 = \frac{\Sigma ab}{(\Sigma a^2 \, \Sigma b^2)^{1/2}} - \frac{f_a \Sigma bg}{(\Sigma b^2 \, \Sigma g^2)^{1/2}} - \frac{f_b \Sigma ag}{(\Sigma a^2 \, \Sigma g^2)^{1/2}} + f_a f_b = r_{ab} - f_a r_{bg} - f_b r_{ag} + f_a f_b.$$

Or, substituting the values for f found necessary in condition 3,

$$0 = r_{ab} - f_a f_b - f_b f_a + f_a f_b = r_{ab} - r_{ag} \cdot r_{bg} \qquad (8) \quad \text{(Appendix, p. iv.)}$$

Our task now becomes that of proving this equation to be true, in which case the fourth condition will be satisfied.
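
The third condition lends itself to a small numerical check. The sketch below is an illustration only, built on simulated scores (the sample size, the seed, and the weight 0.7 are arbitrary assumptions, not values from the paper): with a and g in standard units, the remainder $d_a = a - f_a g$ is uncorrelated with g when $f_a = r_{ag}$, and not otherwise.

```python
import math
import random


def standardize(values):
    """Rescale a list to mean 0 and standard deviation 1 (dividing by N)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def corr(x, y):
    """Pearson product-moment correlation of two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / math.sqrt(sxx * syy)


random.seed(0)
g = standardize([random.gauss(0, 1) for _ in range(500)])       # simulated common element
a = standardize([0.7 * gi + random.gauss(0, 1) for gi in g])    # simulated variable a

f_a = corr(a, g)                                  # the value demanded by condition 3
d_a = [ai - f_a * gi for ai, gi in zip(a, g)]     # specific part d_a = a - f_a g
d_other = [ai - 0.2 * gi for ai, gi in zip(a, g)] # any other multiplier, for contrast

print(corr(d_a, g), corr(d_other, g))
```

The first printed value is zero up to rounding error, while the second is clearly not, which is all that condition 3 asserts.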

In doing this, Spearman makes use of the equation for the correlation between sums. We shall accordingly interrupt our analysis of the "reversibility of proof" at this point, in order to show how the formula for determining the correlation between sums is derived. This takes us to a consideration of Spearman's contribution in 1913. (5)

Correlation between Sums. Using the symbols employed by Spearman, let our task be to derive an equation expressing the correlation between the sum of a variables, and the sum of b variables; the number of a variables ranging from 1 to p, the number of b variables ranging from 1 to q, with N cases in each variable. (See Fig. 2.)
Fig. 2. [Table of scores: N cases (rows) by the a group of variables $a_1, a_2, \dots, a_p$ and the b group of variables $b_1, b_2, \dots, b_q$.]

Let S represent the summation by row (horizontally); let Σ represent the summation by column (vertically).

$\overset{p}{S}_a$ expresses the sum of the a measures in any row, from the $a_1$ to the $a_p$ of that row. Similarly for $\overset{q}{S}_b$.

$\overset{N}{\Sigma} a$ expresses the sum of the a measures from 1 to N in any a-variable.

$\overset{N}{\Sigma} \overset{p}{S}_a$ expresses the sum of the N $S_a$'s, and therefore the sum of all the measures in all of the a-variables.
Each $b_1, b_2, b_3, \dots, b_q$ is multiplied by a constant $m_1, m_2, m_3, \dots, m_q$, respectively. (It is, of course, obvious that the correlation between two variables is not affected by the introduction of a constant factor in each variable; for the deviations, and the sum of the squares of the deviations (standard units) are not thereby influenced. Thus $r_{(n_1 a_1)(m_1 b_1)} = r_{a_1 b_1}$.)

$S_{na}$ expresses the summation of any a-row, times n.
$S_{mb}$ expresses the summation of any b-row, times m.
$\Sigma S_{na}$ expresses the summation of N rows of a's, times n.
$\Sigma S_{mb}$ expresses the summation of N rows of b's, times m.
$\Sigma S_{na} / N$ expresses the average summation of an a-row, times n.
$\Sigma S_{mb} / N$ expresses the average summation of a b-row, times m.
$S_{na} - \Sigma S_{na} / N$ expresses the deviation of the $S_{na}$ for any row from the average summation of the rows.
$S_{mb} - \Sigma S_{mb} / N$ is a similar expression in the case of the b-group.

The correlation between the sum of the na-variables and the sum of the mb-variables may now be expressed as follows:

$$r_{(n_1 a_1 + n_2 a_2 + \dots + n_p a_p)(m_1 b_1 + m_2 b_2 + \dots + m_q b_q)} = \frac{\Sigma \left(S_{na} - \frac{\Sigma S_{na}}{N}\right)\left(S_{mb} - \frac{\Sigma S_{mb}}{N}\right)}{\left[\Sigma \left(S_{na} - \frac{\Sigma S_{na}}{N}\right)^2\right]^{1/2} \left[\Sigma \left(S_{mb} - \frac{\Sigma S_{mb}}{N}\right)^2\right]^{1/2}} = \frac{A \text{ (Numerator)}}{B \text{ (Denominator)}}$$

Let $a_s$ include every a from 1 to N in every a-variable.
Let $b_t$ include every b from 1 to N in every b-variable.
Let $n_s$ include every n from 1 to p.
Let $m_t$ include every m from 1 to q.

Expression A (Numerator) may be written

$$n_s m_t \, \Sigma \left(S_a - \frac{\Sigma S_a}{N}\right)\left(S_b - \frac{\Sigma S_b}{N}\right)$$

(since the expression $S_{na}$ includes all values of n from 1 to p, which we have called $n_s$; and therefore $n_s$ may be treated as a constant. Similarly for $m_t$ in the b-group.)
The aim is to state the above expression for the correlation between sums $S_a$ and $S_b$, (the pooled scores for individuals) in terms of correlations between the simple elements, such as $r_{a_1 b_1}, r_{a_1 b_2}, \dots, r_{a_1 b_q}, r_{a_2 b_1}, r_{a_2 b_2}, \dots, r_{a_2 b_q}, \dots, r_{a_p b_q}$.

Since $S_a$ for an individual = the sum of the values $a_1, a_2, \dots, a_p$ for that individual, and $S_b$ similarly $= b_1 + b_2 + \dots + b_q$, we may write the Numerator thus:

$$n_s m_t \, \Sigma \left[(a_1 + a_2 + \dots + a_p) - \frac{\Sigma (a_1 + a_2 + \dots + a_p)}{N}\right]\left[(b_1 + b_2 + \dots + b_q) - \frac{\Sigma (b_1 + b_2 + \dots + b_q)}{N}\right]$$

$$= n_s m_t \, \Sigma \left[\left(a_1 - \frac{\Sigma a_1}{N}\right) + \left(a_2 - \frac{\Sigma a_2}{N}\right) + \dots + \left(a_p - \frac{\Sigma a_p}{N}\right)\right]\left[\left(b_1 - \frac{\Sigma b_1}{N}\right) + \left(b_2 - \frac{\Sigma b_2}{N}\right) + \dots + \left(b_q - \frac{\Sigma b_q}{N}\right)\right]$$

$$= n_s m_t \, \Sigma \left[(a_1 + a_2 + \dots + a_p)(b_1 + b_2 + \dots + b_q)\right]$$

(where $a_1, a_2, \dots, a_p$ and $b_1, b_2, \dots, b_q$ are expressed in deviations from the respective means.)

$$= n_s m_t \, \Sigma \left[a_1 b_1 + a_1 b_2 + \dots + a_1 b_q + a_2 b_1 + a_2 b_2 + \dots + a_2 b_q + \dots + a_p b_1 + a_p b_2 + \dots + a_p b_q\right]$$

$$= n_s m_t \left[r_{a_1 b_1} N \sigma_{a_1} \sigma_{b_1} + r_{a_1 b_2} N \sigma_{a_1} \sigma_{b_2} + \dots + r_{a_1 b_q} N \sigma_{a_1} \sigma_{b_q} + \dots + r_{a_p b_1} N \sigma_{a_p} \sigma_{b_1} + \dots + r_{a_p b_q} N \sigma_{a_p} \sigma_{b_q}\right]$$

$$= n_s m_t N \sigma_a \sigma_b \, S\, r_{a_s b_t} \qquad (A)$$

(where $S\, r_{a_s b_t}$ indicates the sum of the correlations between every a and every b. There are pq such correlations. The expression $\sigma_a$ indicates the constant standard deviation of each a. The standard deviations have been made equal. A similar interpretation holds for $\sigma_b$.)

Expression A might also be written thus:

$$n_s m_t N \sigma_a \sigma_b \, p q \, \bar{r}_{ab} \qquad (A')$$

(where $\bar{r}_{ab}$ is the average of all the pq correlations between the a and b variables.)
The First Factor of the Denominator (B) may be written

$$n_s \left[\Sigma \left(S_a - \frac{\Sigma S_a}{N}\right)^2\right]^{1/2} = n_s \left[\Sigma \left((a_1 + a_2 + \dots + a_p) - \frac{\Sigma (a_1 + a_2 + \dots + a_p)}{N}\right)^2\right]^{1/2}$$

(since $S_a = a_1 + a_2 + \dots + a_p$)

$$= n_s \left[\Sigma (a_1 + a_2 + \dots + a_p)^2\right]^{1/2}$$

(where $a_1, a_2, \dots, a_p$ are expressed as deviations from respective means)

$$= n_s \left[\Sigma a_1^2 + \Sigma a_2^2 + \dots + \Sigma a_p^2 + 2 \Sigma a_1 a_2 + \dots + 2 \Sigma a_{p-1} a_p\right]^{1/2}$$

$$= n_s \left[N \sigma_{a_1}^2 + N \sigma_{a_2}^2 + \dots + N \sigma_{a_p}^2 + 2 N r_{a_1 a_2} \sigma_{a_1} \sigma_{a_2} + \dots + 2 N r_{a_{p-1} a_p} \sigma_{a_{p-1}} \sigma_{a_p}\right]^{1/2}$$

$$= n_s \sqrt{N} \, \sigma_a \left(p + 2\, S_{(s)(s-1)}\, r_{a_s a_{s-1}}\right)^{1/2} \qquad (C)$$

(where $2\, S_{(s)(s-1)}\, r_{a_s a_{s-1}}$ expresses the sum of the correlations between every a and every other a. There are p(p-1) of these correlations. The value $S_{(s)(s-1)}\, r_{a_s a_{s-1}}$ is doubled because each particular correlation is included twice, as will be seen from the following diagram:)

Expressed in terms of mean correlation, this becomes:

$$n_s \sqrt{N} \, \sigma_a \left(p + p(p-1) \, \frac{2\, S_{(s)(s-1)}\, r_{a_s a_{s-1}}}{p(p-1)}\right)^{1/2} = n_s \sqrt{N} \, \sigma_a \sqrt{p} \left(1 + (p-1)\, \bar{r}_{aa}\right)^{1/2} \qquad (C')$$

(where $\bar{r}_{aa}$ expresses the mean of all the correlations between every a and every other a.)

By a like procedure, the Second Factor of the Denominator may be reduced to

$$m_t \sqrt{N} \, \sigma_b \left(q + 2\, S_{(t)(t-1)}\, r_{b_t b_{t-1}}\right)^{1/2} \qquad (D)$$

or to

$$m_t \sqrt{N} \, \sigma_b \sqrt{q} \left(1 + (q-1)\, \bar{r}_{bb}\right)^{1/2} \qquad (D')$$


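Returning to the original expression for the correlation between sums, we may, therefore, now write:

$$r_{(n_1 a_1 + \dots + n_p a_p)(m_1 b_1 + \dots + m_q b_q)} = \frac{n_s m_t N \sigma_a \sigma_b \, S\, r_{a_s b_t}}{\left[n_s \sqrt{N} \, \sigma_a \left(p + 2\, S_{(s)(s-1)}\, r_{a_s a_{s-1}}\right)^{1/2}\right]\left[m_t \sqrt{N} \, \sigma_b \left(q + 2\, S_{(t)(t-1)}\, r_{b_t b_{t-1}}\right)^{1/2}\right]}$$

$$= \frac{S\, r_{ab}}{\sqrt{p + 2\, S\, r_{aa}} \; \sqrt{q + 2\, S\, r_{bb}}} \qquad \text{Equation 2, p. 419 in Ref. 5, or Equation 12, p. 424 in Ref. 5.}$$

(where the elements have been reduced to equal standard deviations before combining them, and where $S\, r_{ab}$ has the same meaning as $S\, r_{a_s b_t}$; and $S\, r_{aa}$ has the same meaning as $S_{(s)(s-1)}\, r_{a_s a_{s-1}}$; and similarly for $S\, r_{bb}$).

Expressed in terms of mean correlations, we have

$$r_{(n_1 a_1 + \dots + n_p a_p)(m_1 b_1 + \dots + m_q b_q)} = \frac{n_s m_t N \sigma_a \sigma_b \, p q \, \bar{r}_{ab}}{\left[n_s \sqrt{N} \, \sigma_a \sqrt{p} \left(1 + (p-1)\, \bar{r}_{aa}\right)^{1/2}\right]\left[m_t \sqrt{N} \, \sigma_b \sqrt{q} \left(1 + (q-1)\, \bar{r}_{bb}\right)^{1/2}\right]}$$

$$= \frac{\sqrt{pq} \; \bar{r}_{ab}}{\sqrt{1 + (p-1)\, \bar{r}_{aa}} \; \sqrt{1 + (q-1)\, \bar{r}_{bb}}} \qquad \text{Equation 1, p. 419, and Equation 13, p. 425, in Reference 5.}$$

(where the elements have been reduced to equal standard deviations.)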
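
The two forms of the result can be checked numerically. The following Python sketch is an illustration on simulated scores, not part of Spearman's derivation: the sample size, the seed, and the way the scores are generated are arbitrary assumptions, and the function names `equation_1` and `equation_2` are merely labels borrowed from the numbering used here. Every variable is first reduced to unit standard deviation, as the derivation requires.

```python
import math
import random


def standardize(values):
    """Rescale a list to mean 0 and standard deviation 1 (dividing by N)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def corr(x, y):
    """Pearson product-moment correlation of two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / math.sqrt(sxx * syy)


def pair_sum(variables):
    """Sum of the correlations between every variable and every other, each pair once."""
    k = len(variables)
    return sum(corr(variables[i], variables[j])
               for i in range(k) for j in range(i + 1, k))


def equation_2(a_vars, b_vars):
    """S r_ab / (sqrt(p + 2 S r_aa) * sqrt(q + 2 S r_bb))."""
    p, q = len(a_vars), len(b_vars)
    s_ab = sum(corr(a, b) for a in a_vars for b in b_vars)
    return s_ab / (math.sqrt(p + 2 * pair_sum(a_vars))
                   * math.sqrt(q + 2 * pair_sum(b_vars)))


def equation_1(a_vars, b_vars):
    """The same quantity expressed through mean correlations."""
    p, q = len(a_vars), len(b_vars)
    mean_ab = sum(corr(a, b) for a in a_vars for b in b_vars) / (p * q)
    mean_aa = 2 * pair_sum(a_vars) / (p * (p - 1)) if p > 1 else 0.0
    mean_bb = 2 * pair_sum(b_vars) / (q * (q - 1)) if q > 1 else 0.0
    return (math.sqrt(p * q) * mean_ab
            / (math.sqrt(1 + (p - 1) * mean_aa) * math.sqrt(1 + (q - 1) * mean_bb)))


random.seed(1)
N, p, q = 200, 3, 4
common = [random.gauss(0, 1) for _ in range(N)]   # simulated shared component
a_vars = [standardize([c + random.gauss(0, 1) for c in common]) for _ in range(p)]
b_vars = [standardize([c + random.gauss(0, 1) for c in common]) for _ in range(q)]

sum_a = [sum(col[i] for col in a_vars) for i in range(N)]   # pooled a score per case
sum_b = [sum(col[i] for col in b_vars) for i in range(N)]   # pooled b score per case
direct = corr(sum_a, sum_b)
print(direct, equation_2(a_vars, b_vars), equation_1(a_vars, b_vars))
```

The three printed values agree to within rounding error, since both formulas are algebraic identities once the standard deviations have been equalized.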

From Equation 2 (above), the case where the standard deviations of the a's, or of the b's are unequal, (i.e., where the elements have not been reduced to the same standard deviation before combining), is easily provided for, as follows: The required value for the right-hand member of the equation would be:

$$\frac{S(\sigma_a \sigma_b \, r_{ab})}{\sqrt{S(\sigma_a^2) + 2\, S(\sigma_a \sigma_{a'} \, r_{aa'})} \; \sqrt{S(\sigma_b^2) + 2\, S(\sigma_b \sigma_{b'} \, r_{bb'})}} \qquad \text{Equation 3, p. 419, Ref. 5.}$$


It is easily shown that the correlation between sums of series is equal to the correlation between the averages of those series. In other words, the correlation between the composite scores of a group of individuals is equal to the correlation between the averages of those scores. For example:

$$r_{(a_1 + \dots + a_p)(b_1 + \dots + b_q)} = \frac{\Sigma \left(S_a - \frac{\Sigma S_a}{N}\right)\left(S_b - \frac{\Sigma S_b}{N}\right)}{\left[\Sigma \left(S_a - \frac{\Sigma S_a}{N}\right)^2\right]^{1/2} \left[\Sigma \left(S_b - \frac{\Sigma S_b}{N}\right)^2\right]^{1/2}}$$

$$= \frac{\Sigma \left(\frac{S_a}{p} - \frac{\Sigma S_a}{Np}\right)\left(\frac{S_b}{q} - \frac{\Sigma S_b}{Nq}\right)}{\left[\Sigma \left(\frac{S_a}{p} - \frac{\Sigma S_a}{Np}\right)^2\right]^{1/2} \left[\Sigma \left(\frac{S_b}{q} - \frac{\Sigma S_b}{Nq}\right)^2\right]^{1/2}}$$

March, 1935

(This latter expression is obviously the correlation between the average $S_a / p$ and the average $S_b / q$.)
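
The equality of the two correlations can be confirmed on any set of scores. The sketch below uses simulated data (the sizes N, p, q and the seed are arbitrary assumptions) and compares the correlation between row sums with the correlation between row averages.

```python
import math
import random


def corr(x, y):
    """Pearson product-moment correlation of two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / math.sqrt(sxx * syy)


random.seed(2)
N, p, q = 50, 3, 5
# Each row is one individual; the b scores are made to depend loosely on the first a score.
a_rows = [[random.gauss(0, 1) for _ in range(p)] for _ in range(N)]
b_rows = [[0.5 * a_row[0] + random.gauss(0, 1) for _ in range(q)] for a_row in a_rows]

sums_a = [sum(row) for row in a_rows]
sums_b = [sum(row) for row in b_rows]
avgs_a = [s / p for s in sums_a]     # dividing by the constant p
avgs_b = [s / q for s in sums_b]     # dividing by the constant q

r_sums = corr(sums_a, sums_b)
r_avgs = corr(avgs_a, avgs_b)
print(r_sums, r_avgs)
```

Dividing every sum by a constant leaves the correlation unchanged, which is the same point made earlier about the constant multipliers n and m.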
2 











The correlation between sums as given by Spearman has been analyzed, with particular emphasis upon the formulae basic to the argument of this paper. We may now return to the "reversibility of proof."

The fourth condition demanded that $0 = r_{ab} - r_{ag} \cdot r_{bg}$, (Appendix, p. iv, equation 8; see p. 218, this paper); and this demand is readily seen to be satisfied in the case where the set of variables entering into the table of coefficients is very large. The proof is as follows:

Let there be a large number (m) of x-variables, ranging from variable a to variable z. These are represented in Fig. 3.

Fig. 3. Group of x-variables
These variables are expressed, as before, in the form $a = f_a g + d_a$. The sum of all the x-variables, (m of them), between and including the limits x = a and x = z may be expressed $\overset{x=z}{\underset{x=a}{S}}(x)$; and this also constitutes the most representative value for g,--the common overlapping of all the x-variables. The correlation of any particular variable a with the sum of the x-variables including a, (i.e., with a + b + .... + z) expresses, therefore, the correlation between a and g. Hence

$$r_{ag} = r_{(a)(a + b + \dots + z)}.$$
Consequently, from Equation 1 (p. 222, above),

$$r_{ag} = \frac{\sqrt{1 \cdot m} \; \bar{r}_{ax'}}{\sqrt{1 + (1-1)\, \bar{r}_{aa}} \; \sqrt{1 + (m-1)\, \bar{r}_{x'x''}}}$$

(where x' is any one of the x-variables, and x'' is any other one. Then $\bar{r}_{x'x''}$ expresses the average of the correlations between every x, (or x') and every other x, (or x'').)

Simplifying this equation, we have

$$r_{ag} = \frac{\sqrt{m} \; \bar{r}_{ax'}}{\sqrt{m\, \bar{r}_{x'x''} + \text{a fraction}}} = \frac{\sqrt{m} \; \bar{r}_{ax'}}{\sqrt{m\, \bar{r}_{x'x''}}} \;\; \text{(since m is very large)} \;\; = \frac{\bar{r}_{ax'}}{\sqrt{\bar{r}_{x'x''}}}.$$

Similarly,

$$r_{bg} = \frac{\bar{r}_{bx'}}{\sqrt{\bar{r}_{x'x''}}}.$$

Therefore

$$r_{ag} \cdot r_{bg} = \frac{\bar{r}_{ax'} \; \bar{r}_{bx'}}{\bar{r}_{x'x''}}.$$

But, by the postulated tetrad equation,

$$r_{ab} \cdot r_{xx^*} = r_{ax} \cdot r_{bx^*},$$

(where x and x* exclude respectively a and b. In $r_{xx^*}$, any x except a may be correlated with any other x except b. The value $r_{ab}$ is thus excluded.)

Hence $r_{ab} \cdot \Sigma (r_{xx^*}) = \Sigma (r_{ax} \cdot r_{bx^*})$; and therefore

$$r_{ab} \cdot \bar{r}_{xx^*} = \bar{r}_{ax} \cdot \bar{r}_{bx^*}, \quad \text{i.e.,} \quad r_{ab} = \frac{\bar{r}_{ax} \; \bar{r}_{bx^*}}{\bar{r}_{xx^*}}.$$

But since m is large, this value approximates to the value it would have if x and x* took all values quite independently of each other, i.e., including a and b.

Hence

$$r_{ab} = \frac{\bar{r}_{ax'} \; \bar{r}_{bx'}}{\bar{r}_{x'x''}} = r_{ag} \cdot r_{bg};$$

and therefore $0 = r_{d_a d_b}$. (p. 6 above.)

The fourth condition is therefore satisfied for a large number of variables.
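
How quickly the approximation improves with m can be seen in a small computation. The sketch below assumes, purely for illustration, a table in which the tetrad criterion holds exactly because every correlation is the product of two loadings, $r_{xy} = f_x f_y$; the loadings themselves (spread evenly from 0.3 to 0.9) are hypothetical. It then measures how far $r_{ab}$ departs from the product of the two estimates $\bar{r}_{ax'} / \sqrt{\bar{r}_{x'x''}}$ and $\bar{r}_{bx'} / \sqrt{\bar{r}_{x'x''}}$.

```python
import math


def loadings(m):
    """Hypothetical loadings f_x, spread evenly from 0.3 to 0.9 over m variables."""
    return [0.3 + 0.6 * i / (m - 1) for i in range(m)]


def r_g_estimate(f, a):
    """Mean r_ax' divided by the square root of mean r_x'x'', with r_xy = f_x * f_y."""
    m = len(f)
    mean_r_ax = sum(f[a] * f[x] for x in range(m) if x != a) / (m - 1)
    total = sum(f)
    squares = sum(v * v for v in f)
    # Mean of f_x * f_y over all ordered pairs with x different from y.
    mean_r_xx = (total * total - squares) / (m * (m - 1))
    return mean_r_ax / math.sqrt(mean_r_xx)


def gap(m):
    """Absolute difference between r_ab and the product of the two estimates."""
    f = loadings(m)
    a, b = 0, m - 1                  # the two extreme variables
    r_ab = f[a] * f[b]
    return abs(r_ab - r_g_estimate(f, a) * r_g_estimate(f, b))


for m in (10, 100, 1000):
    print(m, gap(m))
```

Under these assumptions the discrepancy shrinks steadily as m grows, in line with the statement that the fourth condition is satisfied for a large number of variables.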


The four conditions necessary to the 
postulated “reversibility” have now been 
shown to be satisfied. In the case of con- 
dition 4, however, m was assumed to be 
large. That this condition is met without 
THE NATURE OF VERBAL AND NON-VERBAL ABILITIES

by
Jess H. Edds
Lincoln Memorial University
Harrogate, Tennessee

The interest of the present study was centered around the problem of differentiable mental abilities with emphasis on the "general ability" contention of Spearman (7) and his followers as opposed by Thorndike (11) who holds for "special abilities." Throughout the study verbal ability is meant to embrace that capacity necessary in comprehending the meaning of words both in and out of context and in reading material for facts called for, and by non-verbal ability is meant that ability necessary in solving certain non-verbal problems and analogies not presenting words. The study has attempted to answer the following pertinent questions: 1. Are verbal and non-verbal abilities independent capacities or are they the same capacity tested differently? 2. Do these abilities possess group factors? 3. Is the group factor present in the non-verbal tests, also present in the verbal tests? 4. What is the relationship between these abilities and intelligence? 5. How should these abilities be weighted in predicting school grades?

The subjects were one hundred forty high school students in Peabody Demonstration School during the school year of 1929-1930. Maturity as an influencing factor seems to have been negligible as is indicated by correlations involving chronological age. The median chronological age was fifteen years and five months. The tests used to measure the designated abilities were as follows:

VERBAL TESTS

1. Haggerty Reading (Sigma 3)
2. Whipple College Reading
3. Means Hard Opposites
4. Inglis Vocabulary

NON-VERBAL TESTS¹

1. International Group Mental Test (Form B)
2. Geometric Form Test
3. Atkinson Group Test

Mental ability was measured by the Otis Self Administering Test, Form A. All tests were administered by the group method and scored by objective keys. The International Group Mental Test (devised by E. A. Doll) is an arrangement of pictures into many grouped series such that an item in one series matches in some specific way an item in the other part of the same series. The more advanced sections of the scale have a centrally located picture which serves as a cue to similarity required. The subject matches the items by joining them with a pencil mark.

The Geometric Form Test is a pencil-and-paper test adopted from the Minnesota Mechanical Aptitudes Test. Each part of the test presents some geometric figure intact and also dissected. The problem is to draw lines through the completed figure illustrating how the different parts should be placed to make a similar figure.

The Atkinson Group Test (devised by W. R. Atkinson) consists of four rows and four columns of the first sixteen letters of the alphabet. The problem is to make combinations (24 are possible) of four letters taking no two letters from the same row or column.
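
The figure of 24 possible combinations can be verified by enumeration. In the sketch below the placement of the letters in the grid (A to D in the first row, and so on) is an assumption made only for illustration, since the actual arrangement on the test sheet is not described; the count does not depend on it.

```python
from itertools import permutations
import string

letters = string.ascii_uppercase[:16]                 # the first sixteen letters
grid = [letters[4 * r:4 * r + 4] for r in range(4)]   # assumed layout: four rows of four

# Choosing one letter from each row with all columns different amounts to
# assigning a distinct column to each row, i.e. a permutation of the columns.
combos = ["".join(grid[row][col] for row, col in enumerate(cols))
          for cols in permutations(range(4))]

print(len(combos))
```

Each combination takes one letter from every row and every column, so the count is 4 x 3 x 2 x 1.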


RESULTS

A. Calculation.--Intercorrelations were calculated by use of the Pearson product-moment formula. These correlations are shown in Table I, page 226, with reliability coefficients underlined.

1. Tests one and two of the Non-Verbal group were supplied by Joseph Peterson through the National Research Council.
TABLE I
INTERCORRELATIONS BETWEEN THE DIFFERENT TEST ITEMS

                                         2     3     4     5     6     7     8     9
1. Age                                  .04  -.03  -.01  -.04  -.01   .01   .06  -.10
2. Otis                                 .92   .68   .54   .69   .66   .51   .38   .28
3. Haggerty Reading                           .01   .68   .88   .66   .46   .30   .86
4. Whipple College Reading                          .00   .67     687       .25   .28
5. Means Opposites                                        .89   .66   .34   .25   .25
6. Inglis Vocabulary                                            .86   .34   .24   .28
7. International Group Mental Test                                    .86   .49   .47
8. Forms                                                                    .62   .40
9. Atkinson                                                                       .84

The reliability coefficients were obtained by the split halves method and may be considered "high." The average for the verbal tests is .82 and that for the non-verbal tests is .84. The averages of the intercorrelations between the four verbal tests, the three non-verbal tests, and the correlations between the two groups of tests are as follows:

Verbal tests ..................... .63
Non-verbal tests ................. .42
Verbal with non-verbal tests ..... .31

Schneck (5) found an average intercorrelation of .4920 between five verbal tests, .3383 between four numerical tests, and .1441 between the verbal and the numerical groups.

B. The Tetrad Difference Criterion. The tetrad difference technique was applied to correlation coefficients in Table I, with the exception of chronological age and scores on Otis. One hundred five tetrad differences were obtained, although 210 such tetrads are possible if both plus and minus signs are assigned. The opposite sign is not given arbitrarily, but may be obtained by arranging any four variables into six combinations instead of three. Pearson (4) supports the practice of doubling the number of tetrads in order to get a smoother frequency curve. The present study made use of the 105 regular tetrads. M_t = .0486 and PE_t = .0524 when Spearman's (7, Appendix, p. XI) formula was used for finding the probable error of the tetrads. This close agreement of M_t and PE_t is, according to Spearman, part of the tetrad criterion satisfied, whereas Pearson (4) holds that even a one point difference in the second decimal figure is a serious deviation when analyzed comparatively. All of the 105 tetrads fell within Spearman's (7, footnote, p. 295) ultimate criterion. It is concluded, then, that the tests may be thought of as having a central factor present in all of the seven tests plus a specific factor common to each single test and not present in the others. This conclusion is based on the Spearman technique.

C. Correlations between V and N-V. Apparently the specific factors involved have little influence since their relationships are zero within their probable error criterion. It was possible to test this observation by one other method, namely, the correlation between the scores on Verbal and Non-verbal tests. In order to complete this method, reduced scores for the verbal and the non-verbal tests were necessary, and by using Woodworth's (14) method the four verbal tests were reduced to one, referred to as V; likewise the three non-verbal tests were reduced to one set of scores, referred to as N-V. The correlation between V and N-V was then found to be .26, which is rather low in comparison to correlations between verbal material (r = .63) or non-verbal material (r = .42). This low correlation does not agree with an arithmetical average of the intercorrelations between the tests, however the method is in common use (2), (5), (4). The method referred to combines several scores into one by reducing them all to standard scores. The lack of correlation between V and N-V is likely due to specific factors, therefore it appears that V and N-V are in a large measure distinct, or at least not the same. The implication of the study at this point is in opposition to the view of Spearman who holds that the general factor is the same throughout all mental functions, varying only in the proportion that particular abilities make use of the central ability. Spearman's later view admits the possibility of group factors.

In view of the fact that there may yet be group factors involved in the tests used, one other criterion was applied. For this
purpose Kelley's (3) test for group factors was employed. The statement is as follows: "If the intercorrelations between four variables are such that $t_{1234} = t_{1243}$ and $t_{1342} = 0$, they could conceivably have arisen from four variables $X_1$, $X_2$, $X_3$, and $X_4$ through which was a general factor plus, in addition thereto, a second factor common to $X_1$ and $X_2$ or a second factor common to $X_3$ and $X_4$." Kelley's technique was applied to the intercorrelations between Haggerty Reading, Means Opposites, International Mental Test, and Forms. If a designates a general factor common to all four variables, b a group factor common to $x_1$ and $x_2$, and $s_1$, $s_2$, $s_3$, and $s_4$ the factors specific to the respective variables, the factor pattern may be represented as follows:

$$x_1 \text{ (Haggerty Reading)} = \alpha_1 a + \beta_1 b + \gamma_1 s_1$$
$$x_2 \text{ (Means Opposites)} = \alpha_2 a + \beta_2 b + \gamma_2 s_2$$
$$x_3 \text{ (International M.T.)} = \alpha_3 a + \gamma_3 s_3$$
$$x_4 \text{ (Forms)} = \alpha_4 a + \gamma_4 s_4$$

If the variables in the above factor pattern are assumed to be in terms of standard deviation measures, the coefficient of correlation between a pair of variables is equal to the sum of the products of the factor loadings of the factor or factors common to the two variables. Applying this principle, the intercorrelations are as follows:

$$r_{12} = \alpha_1 \alpha_2 + \beta_1 \beta_2 \quad \text{(Haggerty and Means)}$$
$$r_{13} = \alpha_1 \alpha_3 \quad \text{(Haggerty and International M.T.)}$$
$$r_{14} = \alpha_1 \alpha_4 \quad \text{(Haggerty and Forms)}$$
$$r_{23} = \alpha_2 \alpha_3 \quad \text{(Means and International M.T.)}$$
$$r_{24} = \alpha_2 \alpha_4 \quad \text{(Means and Forms)}$$
$$r_{34} = \alpha_3 \alpha_4 \quad \text{(International M.T. and Forms)}$$

Tetrad differences formed from the above correlations give:

$$t_{1234} = \alpha_3 \alpha_4 \beta_1 \beta_2$$
$$t_{1243} = \alpha_3 \alpha_4 \beta_1 \beta_2$$
$$t_{1342} = 0$$

Theoretically the first two tetrads equal each other and the third equals zero.

This method was applied to all combinations of the four variables, first assuming the presence of an additional factor in only two of the variables. But in the last case, to be shown below, an additional factor was assumed in the verbal tests and a different additional factor common to the non-verbal tests but not present in the verbal group. These cases were calculated after the fashion of the above method. The actual tetrads calculated from the four variables chosen are as follows:

$$t_{1234} = .1856$$
$$t_{1243} = .1871$$
$$t_{1342} = .0015$$

The first two tetrads may be considered equal to each other and the third one equal to zero. Assuming a group factor (b) in Haggerty Reading and Inglis Vocabulary and an additional group factor (c) in Atkinson and Forms, the theoretical results (where $\gamma_3$ and $\gamma_4$ now denote the loadings on the group factor c) are:

$$t_{1234} = \alpha_3 \alpha_4 \beta_1 \beta_2 + \alpha_1 \alpha_2 \gamma_3 \gamma_4 + \beta_1 \beta_2 \gamma_3 \gamma_4$$
$$t_{1243} = \alpha_3 \alpha_4 \beta_1 \beta_2 + \alpha_1 \alpha_2 \gamma_3 \gamma_4 + \beta_1 \beta_2 \gamma_3 \gamma_4$$
$$t_{1342} = 0$$

and the actual tetrads are:

$$t_{1234} = .1728$$
$$t_{1243} = .1800$$
$$t_{1342} = .0072$$
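
The algebra of the factor pattern can be checked with arbitrary loadings. In the sketch below the loadings are hypothetical numbers chosen only for illustration, and the tetrad convention ($t_{1234} = r_{12} r_{34} - r_{13} r_{24}$, $t_{1243} = r_{12} r_{34} - r_{14} r_{23}$, $t_{1342} = r_{13} r_{24} - r_{14} r_{23}$) is an assumption that reproduces the theoretical results stated above. The last line also recovers the count of 105 tetrads for seven tests reported earlier.

```python
from math import comb

# Hypothetical loadings: alpha on the general factor a, beta on the group factor b
# (the group factor enters variables 1 and 2 only).
alpha = {1: 0.7, 2: 0.6, 3: 0.5, 4: 0.4}
beta = {1: 0.5, 2: 0.4, 3: 0.0, 4: 0.0}


def r(i, j):
    """Correlation implied by the factor pattern: sum of products of common loadings."""
    return alpha[i] * alpha[j] + beta[i] * beta[j]


def tetrads(i, j, k, l):
    """The three tetrad differences t_ijkl, t_ijlk, t_iklj for four variables."""
    t_ijkl = r(i, j) * r(k, l) - r(i, k) * r(j, l)
    t_ijlk = r(i, j) * r(k, l) - r(i, l) * r(j, k)
    t_iklj = r(i, k) * r(j, l) - r(i, l) * r(j, k)
    return t_ijkl, t_ijlk, t_iklj


t1234, t1243, t1342 = tetrads(1, 2, 3, 4)
expected = alpha[3] * alpha[4] * beta[1] * beta[2]   # alpha_3 alpha_4 beta_1 beta_2
print(t1234, t1243, t1342, expected)

# Seven tests give comb(7, 4) sets of four variables, each yielding three tetrads.
n_tetrads = comb(7, 4) * 3
print(n_tetrads, 2 * n_tetrads)
```

With a group factor confined to the first two variables, the first two tetrads coincide and the third vanishes, whatever loadings are chosen.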

Other combinations of the verbal and the non-verbal tests showed essentially the same thing. These satisfy the condition set up in Kelley's sixteenth proposition. Therefore, if all of the tests are thought of as having a common general factor, and an additional factor present only in the verbal tests, the tetrad differences agree with the prediction. If we consider the additional factor present only in the non-verbal tests, the prediction is also satisfied. The theory is also satisfied if an additional factor is considered present in each of the two types of tests. It follows, then, that there is a group factor common to the verbal tests or to the non-verbal tests. There may even be two group factors; the one common to the verbal tests, the other common to the non-verbal tests. It seems convincing, however, that there are group factors which do not cut across both sets of tests, and none of those examined seem to cut across both verbal and non-verbal material.

The original correlation of .26 between V and N-V is further substantiated, since Kelley's criterion is satisfied. It is not concluded beyond all doubt that this is the case, but at the present time the converse of Kelley's proposition has not been proved.

Since the Spearman technique has been 
employed, it seems advisable to review 
briefly some of its holdings and at the 
same time give the reaction of some of 
Spearman's contemporaries to his viewpoint. 
In 1904 Spearman (8) made his consistent ap- 
peal for “general intelligence." He con- 
cluded that there are “intelligences” and 
also a general universal ability. In 1914 
(9) he gave further evidence of the two fac- 
tor theory by making use of Simpson's and 
Thorndike's data. It was in this study 
that Spearman showed how tetrad differences 
conform to a regular frequency curve. Simp- 
son (6) in his original study, however, 
concluded against Spearman's theory which 
holds that intelligence is to be explained 
on the basis of a hierarchy of mental func- 
tions wherein the amount of correlation in 
each case is due to the degree of connec- 
tion with a common central factor. Pearson 
(4, pp. 289-290) points out that Spearman's 
frequency curves based on tetrad differences 
are by no means symmetrical when exact mathe- 
matics is applied. He finds Spearman's 
probable error in error .001, and the ob- 
served mean in error .005. Pearson further 
contends for an error in Spearman and Holzinger's formula for calculating the probable error of tetrad differences.

In a review of The Abilities of Man, 
Wilson (13) accuses Spearman of calculating 
intelligence instead of measuring it. Later, 
in a comment on Spearman's (10) "g" factor, 
Wilson (12) holds that the equation r,, = 1, 
also is not a single equation, but as many equations as there are subjects taking the test. Quoting Wilson: "just as $x^2 + y^2 + z^2 = 0$, it forces x = 0, y = 0, z = 0." (12, p. 223.)

Asher (2) gave certain tests to 805 college freshmen and found that tetrad differences fell within 3 P.E. His tests, correlated with school marks, showed r = .605 and r = .580 for 1926 and 1927, respectively. Asher concluded that if tests are made to depend on "g" a higher correlation is found between such tests and scholarship.

Only a few representative discussions 
dealing with the two contradicting concep- 
tions of the nature of intelligence have 
been mentioned. It is seen that these views 
are not reconciled to each other. 


D. The Relation of V and N-V to Intelligence and Scholastic records. The low correlation, r = .26, between V and N-V raises the question of relationship between these two measures and mental ability and school grades. Reduced scores were calculated for school grades and for intelligence test scores. The correlation coefficients are as follows:








1. Verbal ability        r,, =
2. Non-verbal ability    r,, = .38
3. Intelligence          r,, = .42
4. School grades         r,, = .40


The subscripts are in keeping with the 
numbers for the tests. 

It is seen that V and N-V correlated 
practically the same with school grades. It 
may be said, then, that the non-verbal tests 
combined are as good a measure of success in 
school as a combination of the verbal tests. 
The higher relationship between V and intel- 
ligence may be real although the verbal 
elements in the Otis test are in part re- 
sponsible for the twelve points advantage. 

The foregoing results seem to justify 
the following tentative conclusions in an- 
swer to the questions raised in the begin- 
ning: 1. Verbal and non-verbal abilities 
seem to be rather different capacities show- 
ing low relationship to each other. 2. Either 
V or N-V contains a factor not present in 
the other. 3. A common group factor does 
not seem to be present to the same degree 
in both verbal and non-verbal material. 

4. Mental ability, as measured by the Otis
S. A., correlates twelve points higher with
V than with N-V. 5. V and N-V have
practically equal weight in predicting class
scores.


MEASUREMENT OF INFANT BEHAVIOR¹


by


Helen Thompson
Clinic of Child Development
Yale University


We tend to think of growth in terms of
increase and differentiation. The infant
shows definite increase in his ability to
attain the upright position, in the variety
of tasks which he can perform, and in
general mastery of his environment. Scales for
measuring growth accordingly have been
composed of items of behavior of ascending
order of difficulty and versatility as have
been the tests for older children and
adults. Pintner,² in his book on "Intelligence
Testing", reflecting the general attitude
regarding mental measurement, lists the
criteria of a test of intelligence and
includes, "Criterion 2. Increasing ability at
successive age levels", and says, "Obviously
if a test or scale fails to show this
increase we can get no measure of the child."
Of course if a test is composed of elements
which do not increase in frequency of
occurrence at ascending age levels, the total
score accordingly will not increase.

Careful and detailed study of infant
behavior and its manner of development has
revealed some further characteristics and
trends which suggest that another method of
evaluating growth levels should be devised
which will utilize more fully the facts of
development and thereby give us a more valid,
accurate, and finely graded measuring device.
The complete findings of the study are
detailed in a recent monograph.³ Here, it
will suffice to say that the group of
infants⁴ studied was a highly homogeneous one
with respect to social-economic status and
race.⁵ Age level distinctions were therefore
defined even more sharply than they
would have been had twice the number of
cases been examined.

For this study, the infant who had
been brought to the clinic by his mother
was taken to the photographic dome or
examining room and, placed nude on the crib
platform, was presented in a specified
manner with the simple objects designed to
elicit his behavior. The examiner,⁶ who
stood by his left side, dictated in a
natural but subdued voice the infant's general
activity and response directed specifically
to the stimulus. Naturally the examiner was
watching for certain behavior which
experience had led her to expect, but she was
also alert to any detail not previously
observed, and furthermore, she was watching
for any response which the infant might
make, whether or not it conformed with the
expected behavior.








1. This paper was presented at the Tenth International Congress of Psychology, August 1932, Copenhagen, Denmark.
2. Pintner, Rudolf, Intelligence Testing. Henry Holt and Co., New York, 1923, p. 62.
3. Gesell, A. and Thompson, H., assisted by Amatruda, C., Infant Behavior: Its Genesis and Growth. McGraw-Hill Book Co., New York, 1934, 343 pages.
4. At least 26 infants, 13 girls and 13 boys, were examined at each age level. With only a few exceptions the records were obtained within two days of the stated age; this precision also reduced the variability of the group and reliability of the findings. No ill or seriously underweight infants were included. All infants were of normal gestation period as indicated by the prenatal history and birth weight. Cinema records of the behavior as well as the mother's reports of the infant's activity at home check and confirm the results.
5. The subjects were all from homes characteristic of the middle social-economic status of the country. The races represented were those of northern European extraction so far as this could be established from information concerning the nationality of the grandparents.


6. Eighty-two percent of the examinations were made by the same two examiners, one who examined the infants from 4 
through 12 weeks and the other who examined them from 16 through 56 weeks.


[Graph: trends of behavior items in the supine situation. Items shown: Head predominantly rotated; Head predominantly in mid position; Arms predominantly in t-n-r position; Arms predominantly symmetrical; Hands active in mutual fingering; Grasps foot; Legs extended and lifted more than briefly; Rolls to side.]


At least 60 percent of the infants at
any age level had been examined in the same
manner four weeks earlier. Because of the
rapidity of growth, situations profitable
for eliciting behavior at early levels were
necessarily dropped as new and more fitting
situations were introduced, but this was done
so that, as far as possible, the continuity
of the examinations of the various age
levels was preserved.

Behavior trends were verified by
reference to the cinema records made at the time
of observation.

Data from one situation will serve to
typify the trends of development observed.


The Supine Situation

The nude infant was placed in supine
position on the crib platform and his
general body posture and behavior were noted. No
further stimulation was afforded; the
behavior therefore indicated not what the
infant could do but what he did do, and it
indicated only what he did do in this specific
situation during the time allotted.

The percentage of the number of cases 
displaying each behavior item at each age 
level was determined. It was evident that 
the trend of development of any specific 
behavior followed one of four courses. 

(1) It increased in frequency of occurrence; 
(2) It decreased; (3) It increased to a cer- 
tain frequency and then decreased; and 

(4) It fluctuated. Items showing constant 
trends were rarely observed. 
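
The four courses described above can be sketched as a simple classification of the percentages observed at successive age levels. This is only an illustration under stated assumptions: the function name, the labels, and the sample percentages are hypothetical, and no allowance is made for sampling fluctuation, which any practical use would require.

```python
def classify_trend(percentages):
    """Classify percentages (one per age level) into one of the four courses."""
    # Successive changes, ignoring age levels where the percentage does not move.
    changes = [b - a for a, b in zip(percentages, percentages[1:]) if b != a]
    if not changes:
        return "constant"
    signs = [1 if c > 0 else -1 for c in changes]
    # Count the reversals of direction along the series.
    reversals = sum(1 for s, t in zip(signs, signs[1:]) if s != t)
    if reversals == 0:
        return "increasing" if signs[0] > 0 else "decreasing"
    if reversals == 1 and signs[0] > 0:
        return "increases then decreases"
    return "fluctuating"


# Hypothetical percentages at successive age levels (not data from the study).
print(classify_trend([5, 20, 60, 90, 95]))      # increasing
print(classify_trend([80, 60, 30, 10, 0]))      # decreasing
print(classify_trend([0, 10, 35, 20, 5]))       # increases then decreases
print(classify_trend([40, 15, 10, 30, 45, 20])) # fluctuating
```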

The graph shows a sampling of the
trends which were found. Head predominately
in midline is behavior seen with increasing
frequency. The curve rises sharply between
12 and 16 weeks; after that it indicates
frequency so common that the behavior
is no longer significant. At early ages
the head is predominately rotated and this
turning of the head furnishes the stimulus
for the tonic neck reflex posture of the
arms. These two aspects of behavior are
related but not interdependent. The
prominent tonic neck reflex position of the
arms is seen with great frequency early but
disappears by 20 weeks, though not because the
head is no longer predominately on the side;
when the head is turned to the side at 20
weeks, the infant either maintains a
symmetrical position of his arms and legs or
he rolls to the side. The rolling to the
side at 4 weeks is brought about by the
rounded back of the infant, together with
considerable, almost continuous, and often
abrupt movements which place his center of
gravity beyond the small area of contact
with the supporting surface, and he
consequently rolls to the side. As the back is
less rounded the activity decreases, but it
increases sharply again, in a different
pattern, as the infant, swinging the legs
and turning the head, rolls to the side.
With more careful study two items could be
specified. This is probably true for all
items which show a fluctuating trend. It
is possible, however, on another basis, to
appraise this fluctuating behavior and
therefore such an analysis has not been
made. As the head assumes the midline
position and the tonic neck reflex disappears,
the arms, which have been externally rotated,
rotate internally and, with semi-flexion at
the elbow, the hands come together; mutual
fingering results, but this activity disappears
when the arms are further extended and
reach down to grasp the feet which, by
flexion of the legs at the hip joint, have
been extended and lifted into the line of
vision.

The changing complexity of development 
is at once apparent when the trends of be- 
havior items are thus studied. Behavior 
growth may be properly regarded as an
organized and systematic changing complex.
This does not oppose the view that
development is signified by increase and
differentiation in ability; instead it emphasizes
the intricacy of the differentiation and in- 
dicates the importance of responses which do 
not align themselves simply with increasing 
abilities. When an infant is placed in the 
dorsal position and brings his hands to- 
gether clasping them, at once his behavior 
indicates between 8 and 32 weeks maturity. 
This surely has more significance concern- 
ing developmental age than the increasing 
item, head predominately in midline which 
when observed portends only the limit of 
four weeks. For the same reason when hands 
catch feet is observed it indicates the 
stage of development reached more definite- 
ly than would any more permanent item. True, 
we cannot attach adverse significance to 
the absence of this behavior but that is 
not adequate reason for disregarding its 
significance when it is seen. 

It is important to note that while 
each item reflects other aspects of activ- 
ity, complete relationship does not exist 
except when the possibilities of response 
are necessarily dichotomous. For instance, 
while prominent tonic neck reflex position 
assumes dissymmetry of the arms, prominent 
dissymmetry of the arms does not necessarily 
imply presence of the tonic neck reflex. The 
multiplicity of items is not the result of
listing different aspects of the same
response. Each item is unique.

Naturally the question is raised
whether most items could not be worded or
arranged so that increasing trends would be
represented. To change the item prominent
tonic neck reflex attitude into an
increasing item and preserve its import could be
done but this would mean neglecting what is
seen and giving value to behavior not seen.
It is preferable to evaluate what we see
rather than what we do not see. Further- 
more the presence of this behavior is mean- 
ingful. It indicates maturity of less than 
20 weeks. 

To arrange the nodal items in an as- 
cending series would be to introduce an ar- 
tifact which is scarcely justified; credit 
could not be given for mutual fingering be- 
cause grasps foot is observed; also a minus 
score for failure to catch hold of the feet 
would not be justified when, at its peak at 
32 weeks, it is seen in only 35 percent of 
the infants whom we have every reason to 
believe represent the normal. It would be 
fully as reasonable to give it a plus rat- 
ing when not observed because 65 percent of 
the cases did not respond in that way! 

The graph is not misleading when it 
shows that only the minority of behavior 
items steadily increase in frequency. When 
behavior is minutely studied, we find it so 
full of individual patterns that 100 percent 
frequency of any specific response is rela- 
tively infrequent even in healthy normal 
children. It is not the presence or ab- 
sence of any one item of behavior but rath- 
er it is the total complex which precisely 
indicates the child's stage of development. 
The nodal, fluctuating and decreasing items 
are entities of behavior which have unique 
importance from a diagnostic and prognostic 
viewpoint. 

Methods of scoring are being worked 
out which will enable us properly to assess 
such behavior. The details of mathematical 
treatment are still in the process of for- 
mulation and are therefore not ready for 
presentation. Several possible methods are 
being tried out. The problem is by no means
an insurmountable one.


THE CHOICE OF CENTRAL TENDENCY

by


Edward A. Lincoln
Harvard University



Most textbooks on educational
statistics discuss to some extent the factors
that should be considered when a choice of
central tendency is made for use in a
particular investigation. No book, however,
gives anything like a complete list of the
advantages and disadvantages of the various
averages, and so the writer has gathered for
his students, and with their help, the
following statements.

Yule, in his Introduction to the Theory
of Statistics, gives a number of criteria
which should be considered when any
problem in the choice of central tendency
arises. These criteria are set down under
the first heading.

The other sections deal with the ad- 
vantages and disadvantages of the various 
averages. These were found, either definite- 
ly or by implication, in the texts, or were 
brought out in class discussion. The most 
helpful books were King's Elements of Sta- 
tistical Method, and Garrett's Statistics 
in Psychology and Education. 

A study of the statements should make 
clear the fact that no one measure of cen- 
tral tendency can be called the best, with- 
out qualification. The various averages 
tell different facts about the series of 
measures which they are used to represent. 
Sometimes one of these facts is described, 
sometimes another, sometimes a combination 
of two or more. 

The lists of advantages and disadvan- 
tages also reveal that the same quality or 
attribute which may be an advantage in one 
problem, or phase of a problem, is likely 
to be a disadvantage in other circumstances. 
Thus, if it is desired to give weight to 
the extreme measures, it is necessary to 
use an average which does this. If the
effect of extremes is to be minimized, a
different type of average must be used. The
fact that the geometric and harmonic means
have special applications is both
advantageous and disadvantageous.
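
A small numerical sketch may make the point about extreme measures concrete. The scores below are hypothetical, and the averages are computed with Python's standard `statistics` module (Python 3.8 or later is assumed for `geometric_mean`); the sketch is an illustration, not part of the original lists.

```python
import statistics

# Hypothetical scores for a small class; the last is an extreme measure.
scores = [2, 3, 3, 4, 5, 6, 40]

mean = statistics.mean(scores)                  # gives weight to the extreme
median = statistics.median(scores)              # middle of the ordered scores
mode = statistics.mode(scores)                  # most frequent score
geometric = statistics.geometric_mean(scores)
harmonic = statistics.harmonic_mean(scores)

print("mean", mean, "median", median, "mode", mode)
print("geometric", round(geometric, 2), "harmonic", round(harmonic, 2))

# The mean multiplied by the number of cases restores the total;
# the median and the mode do not.
print(mean * len(scores), median * len(scores), sum(scores))

# Dropping the extreme score moves the mean far more than the median.
trimmed = scores[:-1]
print(statistics.mean(trimmed), statistics.median(trimmed))
```

Here the single extreme score pulls the arithmetic mean well above every other score, while the median and mode are unaffected by its magnitude.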

Another general consideration is
important. If, as is frequently true, the
investigator wishes to compare his own
results with those of previous studies, he
must usually employ the same measure of
central tendency as was previously employed.
Perhaps the most common application of this
principle is found in the handling of
standard test results. The makers of the tests
have practically all adopted the median as
the measure of central tendency in which to
express the norms or standards. Thus, when
it is desired to compare the abilities of
classes or other groups with norms, it is
necessary to compute the medians.


FACTORS FOR CONSIDERATION IN
THE CHOICE OF CENTRAL TENDENCY


A. Yule's Criteria for Averages. (p. 108ff.)
1. The average should be rigidly defined,
and not left to the mere estimation
of the observer.
2. It should be based on all the
observations made.
3. It should not be of too abstract
mathematical character. That is, it
should be readily understood.
4. It should be calculated with
reasonable ease and rapidity.
5. It should be as little affected as
possible by the fluctuations of
sampling, that is, it should be stable.
(This criterion applies only when the
central tendency is used as a basis
of generalization. It has no point
if the investigator is concerned with
the data actually at hand.)
6. It should lend itself readily to
algebraic treatment.

B. Advantages of the Arithmetic Mean.
1. It meets all of Yule's criteria
satisfactorily.
2. Its calculation does not require the
arrangement of the data in any
particular way.
3. It gives weight to extreme variations,
which is desirable in some cases.
4. It may be calculated when only the
number of items and their aggregate
sum are known, i.e., when information
concerning the separate items is not
available.

C. Disadvantages of the Arithmetic Mean.
1. It cannot be determined by inspection
from a graph or frequency table.
2. It cannot be determined accurately
when the extreme measures of a series
are missing.
3. It emphasizes extreme variations,
which is often undesirable.
4. It cannot be used in the study of
incommensurable quantities.
5. It may fall where no data actually
exist.

D. Advantages of the Median.
1. It is fairly rigidly defined.
2. It is based on all the measures.
3. Though not as familiar a measure as
the mean, it is easy to explain and
understand.
4. It is easy to calculate.
5. It does not give excessive weight to
extreme cases, which is usually
desirable.
6. It can be used when only the number of
the extreme items is known, even if
their exact magnitude cannot be
ascertained.
7. It can be used when the trait or
quality being studied is not susceptible
of measurement in definite units. Thus
it is possible to array a group of
children according to their status in
some trait and find a median.

E. Disadvantages of the Median.
1. It can only be found by a special
arrangement of the measures either in a
distribution table or in serial order.
2. It is not so reliable or stable as
the mean.
3. It cannot be used when it is desirable
to give weight to the extreme measures.
4. It is not susceptible to algebraic
treatment.
5. It may fall where there are few or no
cases.
6. A correct total of the measures cannot
be found by multiplying the median by
the number of cases.

F. Advantages of the Mode.
1. It is very easily found from a
distribution table.
2. It is not affected by extreme
measures.
3. It cannot fall where no measures
exist.
4. It is often the best representation
of the group.

G. Disadvantages of the Mode.
1. It is not rigidly defined.
2. It is not based on all the measures.
3. It is the least reliable or stable of
all the averages.
4. It does not lend itself to algebraic
treatment.
5. Often no well-defined mode is present
in a distribution.
6. A correct total of the measures
cannot be obtained by multiplying the
mode by the number of cases.
7. It may be determined by comparatively
few items.
8. It can only be found by a special
arrangement of the cases in a
distribution table.

H. Advantages of the Geometric Mean.
1. It is rigidly defined mathematically.
2. It is based on all the measures.
3. It is fairly stable or reliable.
4. It is the only average which may be
used legitimately in some cases.

I. Disadvantages of the Geometric Mean.
1. It is not readily understood.
2. It is hard to calculate.
3. It is useful only in special cases.

J. Advantages of the Harmonic Mean.
1. It is rigidly defined mathematically.
2. It is based on all the measures.
3. It is fairly stable or reliable.
4. It is the only average which may be
used legitimately in some cases.

K. Disadvantages of the Harmonic Mean.
1. It is not readily understood.
2. It is hard to calculate.
3. It is useful only in special cases.

L. Advantages of the Midscore.
1. It is based on all the measures.
2. It is readily understood.
3. It is easy to calculate for a small
group, like a single class in school.
4. It does not give excessive weight to
extreme measures.
5. It can be used even if the exact
magnitude of the extreme cases is not
known.

M. Disadvantages of the Midscore.
1. It is not rigidly defined.
2. It is not as reliable or stable as
the arithmetic mean.
3. It can be found only by a special
arrangement of the measures.
4. It does not give weight to the extreme
measures.
5. It is not susceptible to algebraic
treatment.
6. A correct total cannot be obtained by
multiplying the midscore by the
number of cases.
7. It cannot be found when only the
number of cases and their total
magnitude is known.


A VOCABULARY GRADE PLACEMENT FORMULA

by


Alfred S. Lewerenz
Assistant Supervisor
Los Angeles Public Schools


Educators are becoming more selective 
in their choice of texts and supplementary 
books. Price and authorship are not the 
only factors entering into choosing a book 
for a given grade level. Such items as
method of presentation, organization,
illustrations, indexing, format, and
vocabulary are now being given careful
consideration. This note is concerned only with the
measurement of the vocabulary content of a 
given book. The results of vocabulary 
studies should be considered as but a part 
of the information needed in selecting a 
text. Consequently, undue emphasis should 
not be given to measures of (1) vocabulary 
difficulty, (2) vocabulary diversity, and 
(3) vocabulary interest. 

The basis of the technique for
determining measures of difficulty and diversity
is a 20" x 28" sheet. One side presents
the 500 most important words in the English
language in an alphabetical arrangement with
spaces for writing in other words. The
words in a sample of 1000 running words
from a textbook are checked or recorded on
this form. From this record, one can
easily obtain for each alphabetical group of
words (1) the number checked in the 500
word list, (2) the number of words written
in, and (3) the total number of words. From
these data, measures of vocabulary
difficulty and vocabulary diversity are obtained.
These are translated into grade placements.
Directions for these steps are given on the
other side of the tabulation sheet.
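
The tabulation step described above can be sketched in a few lines. Everything in the sketch is an assumption made for illustration: the function name, the tiny stand-in for the 500 word list, the sample sentence, and the choice to count each distinct word once. The conversion of these counts into difficulty and diversity grade placements is not given in this note and is therefore not attempted.

```python
import re
from collections import defaultdict


def tabulate(sample_text, common_words):
    """Tally distinct words by initial letter.

    Returns {letter: (checked_on_list, written_in, total)}.
    """
    distinct = set(re.findall(r"[a-z]+", sample_text.lower()))
    tally = defaultdict(lambda: [0, 0])
    for word in distinct:
        column = 0 if word in common_words else 1  # 0 = checked, 1 = written in
        tally[word[0]][column] += 1
    return {letter: (c, w, c + w) for letter, (c, w) in sorted(tally.items())}


# Hypothetical stand-in for the 500 word list (illustrative only).
COMMON = {"the", "a", "and", "of", "to", "in", "is", "cat", "on"}

# Hypothetical sample; the technique itself uses 1000 running words.
sample = "The cat sat on the mat and the cat purred."

table = tabulate(sample, COMMON)
for letter, (checked, written_in, total) in table.items():
    print(letter, checked, written_in, total)
```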

Vocabulary difficulty is a measure of
the technical or special meaning words used
by the author. Books rating high in
vocabulary difficulty will contain many words of
low frequency that are frequently derived
from Greek and Latin sources. The
difficulty grade placement secured indicates the
degree of reading comprehension needed as
measured by a standardized reading test. On
the basis of five years of experience the
norms for vocabulary difficulty were revised
in February, 1935. The grade placements
have a reliability of .93.

Vocabulary diversity is a measure of
the variety or range of words used without
respect to their difficulty. It indicates the
verbosity or wordiness of an author. The
vocabulary diversity grade placement is
secondary to the vocabulary difficulty grade
placement. Its chief use is in comparing two
or more books that have approximately the
same vocabulary difficulty grade placement.
It is generally true that popular stories
are low both in vocabulary difficulty and
diversity. High-grade literature tends to be
low in difficulty but high in diversity.
Scientific books generally are high in both
difficulty and diversity. The grade placement
of popular literature seems to be between
fifth and sixth grade difficulty.

Vocabulary interest is a measure of pic- 
ture or image bearing words used. Books that 
use few colorful, sensory words are apt to 
lack interest. Books that children read with 
great delight contain a relatively high per- 
centage of image bearing words. The measure 
of vocabulary interest is now undergoing ex- 
pansion which will serve to increase the re- 
liability through securing a better sampling. 

The technique is useful in selecting 
books within the reading comprehension lev- 
el of students. Particularly is this true 
with texts for use with dull over-age pu- 
pils who have mature reading interests and 
a low comprehension level. In high school 
it is possible to recommend texts in the 
same fields where the students are taught 
according to their mental ability. The 
technique is one more aid to administrators 
in adapting the school to the child. 























